ON THE NORMAL APPROXIMATION TO THE BINOMIAL
DISTRIBUTION


By W. FELLER 
Cornell University 


1. Although the problem of an efficient estimation of the error in the normal 
approximation to the binomial distribution is classical, the many papers which 
are still being written on the subject show that not all pertinent questions have 
found a satisfactory solution. Let, for a fixed $n$ and $0 < p < 1$, $q = 1 - p$,

(1)    $T_k = \binom{n}{k} p^k q^{n-k}, \qquad P_{\lambda,\nu} = \sum_{k=\lambda}^{\nu} T_k.$


For reasons of tradition (and, apparently, only for such reasons) one sets

(2)    $x_k = (k - np)\sigma^{-1}, \qquad \sigma = (npq)^{1/2},$

and compares (1) with

(3)    $N_k = (2\pi)^{-1/2}\sigma^{-1} e^{-x_k^2/2}$ and $\Pi_{\lambda,\nu} = \Phi\left(x_\nu + \frac{1}{2\sigma}\right) - \Phi\left(x_\lambda - \frac{1}{2\sigma}\right)$

respectively,¹ where $\Phi(x)$ stands for the normalized error function. Many
estimates are available for the maximum of the difference $|P_{\lambda,\nu} - \Pi_{\lambda,\nu}|$ for all $\lambda$, $\nu$.
Now this error is $O(\sigma^{-1})$, and even a precise appraisal will break down in the two
most interesting cases: if $\sigma$ is small, or if $\lambda$ and $\nu$ are large as compared to $\sigma$.
Indeed, even for moderately large values of $k$ (such as are usually considered)
the contribution of $T_k$ to the sum in (1) will be considerably smaller than $\sigma^{-1}$,
so that any estimate of the form $O(\sigma^{-1})$ leaves us without guidance. With some
modifications this remains true also for more refined estimates like Uspensky's
remarkable result²


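(4)    $P_{\lambda,\nu} = \Pi_{\lambda,\nu} + \frac{q-p}{6\sigma\sqrt{2\pi}} \Big[(1 - x^2)e^{-x^2/2}\Big]_{x_\lambda - 1/(2\sigma)}^{x_\nu + 1/(2\sigma)} + \omega$

with

$|\omega| < \{.13 + .18\,|p - q|\}\sigma^{-2} + e^{-3\sigma/2},$

provided $\sigma \geq 5$. What is really needed in many applications is an estimate of
the relative error, but this seems difficult to obtain.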
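As an illustration of how much the first correction term in (4) contributes, the following sketch evaluates $\Pi_{\lambda,\nu}$ of (3) with and without that term and compares both with the exact sum (1). The function names and the test values ($n = 500$, $p = 0.1$, $\lambda = 50$, $\nu = 55$) are chosen here for illustration only; the remainder $\omega$ is not bounded by the code, it is simply observed.

```python
import math

def phi_cdf(x):
    """Normalized error function Phi(x), i.e. the standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def binom_term(n, k, p):
    """T_k of (1), computed through log-gamma to avoid overflow."""
    log_t = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
             + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_t)

def pi_and_corrected(n, p, lam, nu):
    """Return Pi of (3) and Pi plus the first correction term of (4)."""
    q = 1.0 - p
    sigma = math.sqrt(n * p * q)
    lo = (lam - n * p) / sigma - 0.5 / sigma
    hi = (nu - n * p) / sigma + 0.5 / sigma
    pi_val = phi_cdf(hi) - phi_cdf(lo)

    def bracket(x):
        # The function (1 - x^2) exp(-x^2/2) taken between the limits in (4).
        return (1.0 - x * x) * math.exp(-0.5 * x * x)

    coeff = (q - p) / (6.0 * sigma * math.sqrt(2.0 * math.pi))
    return pi_val, pi_val + coeff * (bracket(hi) - bracket(lo))

# Illustrative values; here sigma is about 6.7, so the proviso sigma >= 5 holds.
n, p, lam, nu = 500, 0.1, 50, 55
exact = sum(binom_term(n, k, p) for k in range(lam, nu + 1))
pi_val, corrected = pi_and_corrected(n, p, lam, nu)
print(f"exact={exact:.6f}  Pi={pi_val:.6f}  corrected={corrected:.6f}")
```

In this example the uncorrected value overshoots in the third decimal place, while the corrected value agrees with the exact sum much more closely.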

It should also be noticed that the accuracy of the normal approximation to the
binomial is by no means quite as good as many texts would make appear. Examples
using $p = 1/2$ and intervals which are symmetric with respect to $np$ are hardly
conclusive, since there the main error term drops out and systematic positive
and negative errors cancel. Again, in practice comparatively small $\lambda$ and
comparatively large $\nu$ are frequently used. It works well to compare a $P_{\lambda,\nu}$ of a
numerical value, say, .93 with a corresponding value $\Pi_{\lambda,\nu}$ of, say, .95. In classroom
discussions the error may seem insignificant. However, in most actual
applications one would consider the complementary probabilities, and the very
same figures mean an approximation .05 to the correct value .07. If a confidence
limit is set to the five per cent level, the normal approximation would in our
example mean that two out of seven critical cases are missed. Consider next the
example $p = 1/10$, $n = 10,000$. For values of $k$ around 1120 the relative error of
$N_k$ is about .30; it increases rapidly with increasing $k$. Around $k = 1150$ the
relative error exceeds 2/3, around 1180 it is nearly 1.4. And yet this example
is conservative in comparison with many cases where the normal approximation
is used in practice.

¹ Very often the limits $x_\lambda$ and $x_\nu$, instead of $x_\lambda - \frac{1}{2\sigma}$ and $x_\nu + \frac{1}{2\sigma}$, are used. This naturally
results in an unnecessary systematic undervaluation.
² Uspensky [3], p. 129. A two-term development of $T_k$ with an error of $O(\sigma^{-3})$, valid for
$|x| < 2$, $\sigma > 3$, has been given by Mirimanoff and Dovaz [1927].
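The growth of the relative error in the tail can be checked directly. The sketch below is an illustration under two assumptions that are not spelled out in the text: it takes $p = 1/10$ with $n = 10,000$, and it measures the relative error of $N_k$ as $(T_k - N_k)/N_k$. Function names are hypothetical, and the exact figures obtained depend on which $k$ "around" the quoted values is chosen.

```python
import math

def log_binom_term(n, k, p):
    """log T_k for the binomial term of (1)."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def log_normal_term(n, k, p):
    """log N_k with the classical norming (2)."""
    sigma = math.sqrt(n * p * (1.0 - p))
    x = (k - n * p) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * x * x

def relative_error(n, k, p):
    """(T_k - N_k) / N_k, computed from the difference of logarithms."""
    return math.expm1(log_binom_term(n, k, p) - log_normal_term(n, k, p))

# Assumed example: n = 10,000 and p = 1/10, so np = 1000 and sigma = 30.
errors = {k: relative_error(10_000, k, 0.1) for k in (1120, 1150, 1180)}
for k, err in errors.items():
    print(k, round(err, 3))
```

Working with logarithms keeps the computation stable even though both $T_k$ and $N_k$ are very small this far from the center.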

It is surprising that the classical norming (2) is generally accepted although
there does not seem to exist any deeper reason for it. The use of moments,
though usually very convenient, does not necessarily lead to best results. For
example, the density function

(5)    $f_n(x) = \frac{1}{n!}\, x^n e^{-x}$

is the $(n + 1)$-fold convolution of $f_0(x)$ with itself and therefore, for large $n$,
of nearly normal "type." The conventional norming would approximate
$f_n(x)$ by $\{2\pi(n+1)\}^{-1/2} e^{-(x-n-1)^2/(2(n+1))}$, while the use of the norming factor $n$
instead of $(n + 1)$ seems clearly indicated.
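A quick numerical look at the peak of the density (5) illustrates the point. The sketch assumes that "norming factor $n$" means replacing $n + 1$ by $n$ both as center and as variance of the approximating normal curve; that reading, the function names, and the choice $n = 20$ are assumptions made for this illustration. Only the value at the mode $x = n$ is compared.

```python
import math

def gamma_density(n, x):
    """f_n(x) = x^n e^{-x} / n!  of (5), evaluated through logarithms."""
    return math.exp(n * math.log(x) - x - math.lgamma(n + 1))

def normal_density(x, center, variance):
    """Normal density with the given center and variance."""
    return (math.exp(-0.5 * (x - center) ** 2 / variance)
            / math.sqrt(2.0 * math.pi * variance))

n = 20
x = float(n)                                   # the mode of f_n
exact = gamma_density(n, x)
moment_based = normal_density(x, n + 1, n + 1)  # conventional norming
mode_based = normal_density(x, n, n)            # n in place of n + 1
print(exact, moment_based, mode_based)
```

At the mode the curve normed with $n$ differs from $f_n$ only by the Stirling correction, whereas the moment-based curve is off by a relative amount of order $1/n$.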


Actually, as will be seen, it is natural (at least for small values of $k - np$)
to replace (2) by

(6)    $x_k = \{k + \tfrac{1}{2} - (n+1)p\}\sigma^{-1},$

and accordingly to approximate $P_{\lambda,\nu}$ by the error integral taken between the limits

(7)    $\{\lambda - (n+1)p\}\sigma^{-1}$ and $\{\nu + 1 - (n+1)p\}\sigma^{-1}.$

For example, let $p = 1/10$, $n = 500$, $\lambda = 50$, $\nu = 55$. The correct value is
$P_{50,55} \approx .317573$; the norming (2) leads to $\Pi_{50,55} \approx .32357$, while the more natural limits
(6) lead to an approximation .31989. More important are the quite unexpected
simplifications which the norming (6) permits when one studies the error for
large $x$, or small $\sigma$.
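The comparison in this example can be sketched numerically. The code below is an illustration, not a reproduction of the original computation: it assumes $p = 1/10$ (the value for which the classical figure .32357 comes out) and, for the limits (7), takes $\sigma^2 = (n+1)pq$. With a different convention for $\sigma$ the last figure changes slightly, so only the exact value and the classical value are checked against the quoted digits. All function names are hypothetical.

```python
import math

def phi_cdf(x):
    """Normalized error function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def binom_term(n, k, p):
    """T_k of (1) via log-gamma."""
    log_t = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
             + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_t)

def exact_probability(n, p, lam, nu):
    """P_{lam,nu}: the sum of T_k for lam <= k <= nu."""
    return sum(binom_term(n, k, p) for k in range(lam, nu + 1))

def classical_value(n, p, lam, nu):
    """Pi_{lam,nu} of (3), based on the norming (2)."""
    sigma = math.sqrt(n * p * (1.0 - p))
    return (phi_cdf((nu + 0.5 - n * p) / sigma)
            - phi_cdf((lam - 0.5 - n * p) / sigma))

def shifted_value(n, p, lam, nu):
    """Error integral between the limits (7); sigma^2 = (n+1)pq is assumed."""
    sigma = math.sqrt((n + 1) * p * (1.0 - p))
    center = (n + 1) * p
    return phi_cdf((nu + 1 - center) / sigma) - phi_cdf((lam - center) / sigma)

n, p, lam, nu = 500, 0.1, 50, 55
exact = exact_probability(n, p, lam, nu)
classical = classical_value(n, p, lam, nu)
shifted = shifted_value(n, p, lam, nu)
print(f"exact={exact:.6f}  classical={classical:.5f}  shifted={shifted:.5f}")
```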

We are now led to reformulate the problem: instead of starting with arbitrary
limits for the error integral and estimating the resulting error, we shall try to determine
the limits so as to minimize the error. Theoretically, for any given $\lambda$, $\nu$ these limits
could be determined so as to give an exact value for $P_{\lambda,\nu}$. However, such limits
would depend in the most intricate way on $\lambda$ and $\nu$. For practical purposes one
would restrict the considerations to certain simple functions such as polynomials.
We shall here consider only the case where the limits are at most quadratic
polynomials. Essentially our problem seems that treated by Serge Bernstein
(and, apparently, only by him). In a series of papers since 1924, S. Bernstein
has considered the accuracy of the normal approximation. Quite recently³
he has, by a considerable computational effort, extended the range of validity
from $npq > 365$ to $npq > 62.5$ and proved the following
THEOREM (S. Bernstein): Let 


(8) npq > 62.5 


and let a; , Bz be the solutions of the quadratic equations 


a — 3 — np = az(npq)” + q a a: 
(9) 
r+}—np= B.(npq)'” +B q 7“ P B.. 
If 
(10) a > 0, B < 2"*(npq)'* 
then 
(11) #(8,) — &(8,) S Py, < P(a,) — Play). 
The conditions (10) are practically equivalent to 
(12) > np + 3, y<npt+2'%'". 


The remarkable feature of this excellent result is that the error remains $O(\sigma^{-1})$
throughout an interval which increases with $\sigma$ (instead of the conventional
uniformly bounded intervals).

In the sequel it will be shown that startling simplifications can be obtained if
the norming (6) is used from the beginning instead of (2). Our main result is an
improvement of S. Bernstein's theorem. The condition (8) will be replaced by
$(n+1)pq > 9$. The first condition in (10) will be relaxed to $k > (n + 1)p$, that
is to say, our theorem will hold for all $k$ exceeding the central value (for those less
than the central value an analogous theorem holds); in the other condition (10),
the numerical value $2^{1/2}$ will be replaced by an arbitrary constant. Instead of
quadratic equations, we shall consider quadratic polynomials. And finally, the
gap between the two sets of limits will be reduced.

It will be seen that the computations leading to this improvement are almost
negligible in comparison with S. Bernstein's deeper method; with slightly more
sophisticated arguments and numerical evaluations, our results can be considerably
improved. Our consideration will be based on a new expression for
$T_k$, in which only exponential terms appear but the usual square root is missing.

³ S. Bernstein [1]; the first paper of the series appears to have appeared in Učenye Zapiski,
Kiev, 1924.

In passing from approximations to $T_k$ to approximations to $P_{\lambda,\nu}$ one has to
replace sums by integrals. This procedure is cumbersome if an estimate of the
relative error is desired. Euler's formula and other standard formulas are of
little use. We shall therefore start with a lemma which, it is hoped, may be
useful in this connection; it will therefore be proved in a slightly more general
form than actually required for the present paper.
2. LEMMA⁴ 1. For $0 < h < \tfrac{1}{2}$ and $|xh| < 1$


z+h/2 1 
(13) eo du = “oer ‘ 
z—h/2 
with | 
4 
zt 1 
14 ~* e,<1 
“ 880 = ° = 285 


Proor. Denote the integral in (13) by J. Then 





h/2 h/2 
(15) hie? y = h [ ett ls ae me Dh [ chate*”'? dt. 
h/2 0 
We begin by showing that for 0 < a < 3 
(16) et leaf < che < emrlt—atle 
In fact 
4/11 a at F 2 
* = — aaa > ea 
(17) e cha > (1+ 5+ Z\(1+ =) 14248 <> (Zz) 26 ; 
and 
on oo af 4 at 1 
a Pars > Keel oe ies aa > = wn > : 
(18) € >(1 + 5 a V1 =) 1 +3 +2 a ft cha 
120 
It follows from (15) and (16) that 
hie 27 > Qn i elt? —D Pi 6— 2A 04155 | (x21) 62/34 804/85 dt 
= 0 
(19) > Qh [- et? 6-24 04/55 .+ 1 f a | dt 
0 3 55 


—1 2—1)¢2/6—x4t4/55)h/2 
= 2h [te lo 


which proves one part of the lemma. 
To obtain an upper estimate we make use of the inequalities 


2 2 
21) ¢2/ Zt —12/3+-7444/ 
e* 1)t2/3 < ( + ye t2/3+24t4/18 


⁴ The fraction 1/2 is chosen quite arbitrarily; if $h$ be restricted to $0 < h < 1$ the first member
of (14) remains unchanged, while the fraction 1/385 on the right side has to be replaced by 1/364.
2 2 2 1-.~. +— 
at O\ sse4/18 3° 6B 
<{1 a ie ae _ 3 
(20) < ( 7 \Q 5 a, 
3 
x -— j 444) 4,988 
< ( a — ‘) etittl18+h4/ 286 


Using (16) and (20), the proof of the second part of the lemma follows from a 
computation analogous to (19). 
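The size of the correction that the lemma captures can be illustrated numerically. The sketch below does not reproduce the constants of (14); it only compares the integral over an interval of length $h$ centered at $x$ with (a) the plain tangent-rule value $h e^{-x^2/2}$ and (b) that value multiplied by $\exp\{(x^2 - 1)h^2/24\}$, which is the leading correction one obtains by expanding the integrand of (15) in powers of $t$. That leading form is derived here for illustration and is an assumption as far as the exact statement of (13) is concerned; function names are hypothetical.

```python
import math

def gaussian_integral(x, h):
    """Integral of exp(-u^2/2) over [x - h/2, x + h/2], via the error function."""
    root2 = math.sqrt(2.0)
    return math.sqrt(math.pi / 2.0) * (math.erf((x + h / 2) / root2)
                                       - math.erf((x - h / 2) / root2))

def tangent_rule(x, h):
    """Interval length times the integrand at the midpoint."""
    return h * math.exp(-0.5 * x * x)

def corrected_rule(x, h):
    """Tangent rule with the leading correction exp((x^2 - 1) h^2 / 24)."""
    return h * math.exp(-0.5 * x * x + (x * x - 1.0) * h * h / 24.0)

x, h = 2.0, 0.4          # satisfies 0 < h < 1/2 and |xh| < 1
exact = gaussian_integral(x, h)
rel_tangent = abs(tangent_rule(x, h) - exact) / exact
rel_corrected = abs(corrected_rule(x, h) - exact) / exact
print(rel_tangent, rel_corrected)
```

The remaining discrepancy of the corrected rule is of fourth order in $h$, which is the order of the error term that (14) bounds.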

For our purposes it is convenient to use Stirling’s formula in a form which is 
not quite the usual one. 

LEMMA 2. (Stirling's formulas). For $n > 4$,


(21) n! = (2r)*(n + gyi 1 nee 
or 

(22) nt= (Ba) a? tt ge ttiee-o tne? 

where 

(23) | 3: | <%, v—0 as n-ow, 


Formula (21) can be derived from the gamma function or in any other way 
that leads to the standard form (22).° 
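The two forms of Stirling's formula can be compared numerically. The sketch below uses only the leading terms: the standard form with the correction $1/(12n)$, and a form in powers of $n + \tfrac12$ with the correction $-1/(24(n + \tfrac12))$, which matches the terms $1/(24(k + \tfrac12))$ appearing later in (26). The remainder terms of (21) to (23) are not reproduced; the function names are hypothetical.

```python
import math

def log_factorial_half_form(n):
    """Leading terms of log n! arranged in powers of m = n + 1/2."""
    m = n + 0.5
    return 0.5 * math.log(2.0 * math.pi) + m * math.log(m) - m - 1.0 / (24.0 * m)

def log_factorial_standard_form(n):
    """Leading terms of the standard Stirling series for log n!."""
    return (0.5 * math.log(2.0 * math.pi) + (n + 0.5) * math.log(n) - n
            + 1.0 / (12.0 * n))

# Remainders (exact minus approximation) for a range of n.
remainders = {
    n: (math.lgamma(n + 1) - log_factorial_half_form(n),
        math.lgamma(n + 1) - log_factorial_standard_form(n))
    for n in range(4, 51)
}
print(remainders[4])
```

Both remainders are of order $n^{-3}$ and have opposite signs, which is why the half-integer form is convenient for the factorials in the denominator of $T_k$.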
3. From now on we shall put

(24)    $\sigma^2 = (n + 1)pq,$
(25)    $x_k = \{k + \tfrac{1}{2} - (n + 1)p\}\sigma^{-1};$

the subscript $k$ will be omitted whenever no confusion is to be feared. To transform
$T_k$ we shall use (21) for the factorials in the denominator, but (22) for
$(n + 1)!$ in the numerator.


5 A simple proof runs as follows. Put B, = n!(n + 9)~-@*Pert#tU/4@tp, Then 


Joe Be = 1 1 1 7146 
QO —_ - — a —— = gy «= = 
© B,  4\6 (2 + 1) Op) 60 (2p)! 





1 
with 0 < 5, < 70 ifp = 5. From here (21) follows using the fact that 
é 


a Bp-1 
>, log =~ = log B, — 4 log (2x) 
p=n+l B, 
and that for n = 4 
1—6 om o 1 
————__ < - < ——_ 
3(n + 4)3 2d 4 ~ 3(n + 4) 


; 3 ; ; ; : 
withO <5 < = In this way the estimate (23) can be considerably improved. 
5 
Then


1 ° 1 
log ((27)’ oT.) = (n + 1) log (n + 1) — (k + 3) ie + 
Pp 








.—k+3 l 1] 
— 1) log n 2 
” + 2) log 7 T n+ 1) 1 24k + 1) 
] 
(26) T 24(n —%: a * 


+ 
= —° (: + #) loe (1 +2) - £(1 “ =) loe(1 “ ) 
q o o Pp o o 





7 f | ( | 
O<p< ea : ai - me 
(27) ao? \360(n + 1) r 2880 Ee + 3)° (n—k+ 5} 
7 1 fas 7 31 os 
< -:- <4 ) 
S 6 36008 \? 4 + gl? + 9), 
provided only that k > 4, (n — k) > 4. Asymptotically p is equivalent to the 
right-hand member without factor § (which, by the way, could be replaced by 
1+ 25). Obviously 





(28) O<p< 


| 
30008 ’ 
ifk > 4,n — k > 4. We shall consider later on the case ¢ > 3, |a2| < 3; 
then clearly k > 4,n — k > 4, so that the use of (28) will be justified. Expand- 
ing (26) into a power series we obtain 

THEOREM. If $k > 4$, $n - k > 4$,


( ce: a 

T, = (29) ag exp = Zz p ( q) a 

2 viv — 1) eo’ 
(29) ai 7 | 
antag ee LF. g) 
+ 94g 2 bP q) 3 (*) : 


where $\rho$ satisfies (28) (and (27)); $x$ and $\sigma$ are defined by (25) and (24), respectively.

Each term of the second series will usually be small as compared to the corresponding
term of the first series; the second series can therefore, if desired, be
absorbed in the error term. If $x$ is small the first term of the first series will be
preponderant. However, as $x$ increases, more and more terms will make themselves
noticeable; if $x \sim \sigma^{1/2}$, three terms will be essential, and so on.

Formula (29) permits us to approximate $P_{\lambda,\nu}$ by means of integrals. The
tangent rule would suggest comparing $P_{\lambda,\nu}$ to

(30)    $\Phi\left(x_\nu + \frac{1}{2\sigma}\right) - \Phi\left(x_\lambda - \frac{1}{2\sigma}\right),$

and (29) together with Lemma 1 permits one easily to estimate the relative error
in the practically most important cases. It is also seen that the limits in (30)
are essentially the only limits depending linearly on $\lambda$ and $\nu$ which will render the
relative error $O(\sigma^{-1})$ for $x = O(1)$. Instead of elaborating on these simple
questions we proceed to the more intricate problem of limits which are quadratic
polynomials in $\lambda$ and $\nu$.
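The expression for $T_k$ on which (26) and (29) rest can be checked numerically. With $\sigma$ and $x$ as in (24) and (25), applying the leading terms of the two Stirling forms gives

$\log\{(2\pi)^{1/2}\sigma T_k\} = -\frac{\sigma^2}{q}\Big(1 + \frac{qx}{\sigma}\Big)\log\Big(1 + \frac{qx}{\sigma}\Big) - \frac{\sigma^2}{p}\Big(1 - \frac{px}{\sigma}\Big)\log\Big(1 - \frac{px}{\sigma}\Big) + \frac{1}{12(n+1)} + \frac{1}{24(k+\frac12)} + \frac{1}{24(n-k+\frac12)} - \rho,$

with a small positive remainder $\rho$. This arrangement is a reconstruction made here for illustration (the display (26) may group the terms differently), and the function names below are hypothetical. The sketch evaluates both sides and reports $\rho$.

```python
import math

def log_scaled_term_exact(n, k, p):
    """log((2*pi)^(1/2) * sigma * T_k), with sigma^2 = (n+1)pq, from log-gamma."""
    q = 1.0 - p
    sigma = math.sqrt((n + 1) * p * q)
    log_t = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
             + k * math.log(p) + (n - k) * math.log(q))
    return 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + log_t

def log_scaled_term_expansion(n, k, p):
    """The same quantity from the logarithmic expression, without the remainder."""
    q = 1.0 - p
    sigma = math.sqrt((n + 1) * p * q)
    x = (k + 0.5 - (n + 1) * p) / sigma
    u = q * x / sigma
    v = p * x / sigma
    main = (-(sigma ** 2 / q) * (1.0 + u) * math.log1p(u)
            - (sigma ** 2 / p) * (1.0 - v) * math.log1p(-v))
    small = (1.0 / (12.0 * (n + 1)) + 1.0 / (24.0 * (k + 0.5))
             + 1.0 / (24.0 * (n - k + 0.5)))
    return main + small

n, p = 500, 0.1
rho = {k: log_scaled_term_expansion(n, k, p) - log_scaled_term_exact(n, k, p)
       for k in (50, 55, 70)}
print(rho)
```

No square root of $k(n-k)$ appears on the right-hand side; the factor $\sigma$ alone carries the normalization, which is what makes the series expansion in powers of $x/\sigma$ so simple.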

4. For brevity we shall from now on put

(31)    $a = \frac{p - q}{6}.$

The estimate $|a| \leq \tfrac{1}{6}$ will be used constantly. It obviously suffices to consider
values of $\lambda \leq \nu$ which exceed the central value $(n + 1)p$.

THEOREM. Suppose that 
(32) ¢>3 
and 


(33) A> (n+ 1p vt+3<(n4+1)pt eo. 
Then 

(34) Pyy < @ PO? t(n,45) — &(m)}, 

if 

(35) ees Np 4 #{& = (n + 1)p\ 4% 


9? 
a o | a 20° 


a. 


while the inequality in (34) is reversed if 
. 


i a. ue ) cee \2 1 
im got eee ee SUP oy Z +s. 
o o o 60 io 


where 
a | , — 1)p}* 
(37) eo 222 oe. 
oO 
The gap between the limits (35) and (36) is O(o ‘) if rt, = O(c). InS8. Bern- 
stein’s case (12), 1 < +/2 and the gap is about 2/(5c). It will be seen from 
the proof that it requires only routine computations to improve the correction 


M 4. 
term ‘z _ i, in (36). 


Proor. Put 
(38) t, = 2 + —2i, 
oC 


again suppressing the subscripts wherever convenient. As a consequence of 
(33), we shall be concerned only with values x; satisfying 
] 


, 2 
3y sw i oe, 
(39) - < 2 3” 
Consider first the main series in (29) and write








—1 
tC -i-v f a 
(40) . >. i 4¢ + A, 
where 
, pt+¢d_«\:, Sp -(-@— 
(41) “* ( 12 a + 2 pv — 1) = 


We shall require some estimates of A. First consider the casea > 0. Then all 
terms of the series are positive, while the oe within parentheses assumes 
its minimum 7; for p = 3. By (39) — < 4° x, whence 


| i”. 
(42) A> * if a>0. 


If a < 0 the signs in the series (41) alternate, each negative term being smaller 
in absolute value than the preceding positive term. Therefore, using (39), 


+q@_a@_q—-p 
(43) Aziet 2 30 ‘ 


aa 


o 


The expression within braces is a cubic in p which assumes its minimum for p = 
(1 + +/793)/72 = .405....It follows that 


4 
(44) a>nivy1é 


SP ow 
=pmae ee as 


(half of this estimate would actually suffice, for our purposes). On the other 

hand, it is evident from (41) that the ratio A/2* attains its maximum for p = 1. 

Therefore, using (39) 

(45) A< 
Next we write 

(46) 

whence . 

p + q ele 1 < ; 1 (2 y—2 

- = 1 amet eme a ee Tae p=. a a a. 

(47) B | a4 it opa bt (—@) 1 (2) 


A trivial computation analogous to (48) shows that B > 0. Again, if a < 0, 
the signs in the series (47) alternate and in this case 


3 3 2 ee 
(48) o<B<3|? =< -*)£<5 e eet, 
ot 








12 2 |o*~ 144 ot 20 
If a > 0 we can majorize (47) by a geometric series and obtain


le 1% 
. © 6  & oo 
(49) O38 8354355 
Now put 
(50) Ag, = " ( + = ws) ‘ 
Then 
(51) & + ZAE. — En44 a 5 Agius 


so that the intervals with endpoints $\xi_k \pm \tfrac{1}{2}\Delta\xi_k$ are non-overlapping and
contiguous. Clearly


1/2 
(52) At = oi +23 ; 
Introducing (40), (46), and (52) into (29) we obtain 


T, = (ny **ag-exp} —§ -A+B+4¢- Hog (1 +2) 
40° o 


o 
(53) ve 
PY _ 
T+ Dae ob. 

To appraise the logarithmic term we write 
(54) jtog(1 + ‘<*) ao ~¢. 

o o 
C ¢° attains its maximum value when a = — }, and it is readily seen that 

i 

0<C< “s if a>0 

(55) 


< 
o<c< St it aco. 


co 





Finally we put, with a parameter wu to be determined, 


2a — u 


(56) oes Ay = Aé. 
If one puts 
1 a 


and 7 is defined by (35), then 


(58) Ye + ZOye = mer, Ye — GAY = |. 
On the other hand, if
(59) 


and nx is defined by (36), the identities (58) hold again. Accordingly, all we have 
to show is that, with u defined by (57), 
(60) Te S (®(ye + 4Ayx) — (ye — SAY) PPM 


and that the inequality in (60) is reversed if u is defined by (59). 
Elementary transformations lead from (53) to 


(61) Ty = (2r) ‘Ay: exp, —¥ = (y? — 1) a" 5 — Pa) + Bh, 
2 o 
where 


_ dau u 5a 1 4aé\ » 
62 ‘+. t= Be GME. ans, Se Bese 1 — A 3 C — p. 
= Qo? & =.) - 240° ( T o yy “ee , 


Let now wu be defined by (57). In view of lemma 1 and (61), the inequality 
(60) will be proved if we show that 


; *(Ay)* 
EB, =E+4% <0. 
(63) we 
Now clearly 

, dag y (dy) , y*(Ay)' 
64 > 
ae ae a() sd st) 24 — 880 


Moreover, introducing the estimates (28), (32), (42), (44), (48), (49), and (55) 
into (62) it is seen that for a > 0 
i 1 25 1 42 Bist 

5 Ex = — =- —— 
(65) oe aS 
and fora < 0 





2 2m 2 1 142 ., 
(66) ‘ms a6 + 5& ~ 69° 


The derivatives of the right-hand members in (65) and (66) are both negative 
for — > 0. Now we are interested only in values x satisfying (39). For such 






= 


107 
ralues § > -——. pe = 2 ight-he bers in (65) ¢ 66) ar 
values § > 2160 For & 2160 the right-hand members in (65) and (66) are 


] 

negative, so that E; < 0 for x > 20° This proves the first. part of our theorem. 
o 

The proof that with (59) the inequality in (60) is reversed proceeds on similar 

lines. We have to show that 






(67) =E-‘ 

















yom FL) 
Suppose that a < 0, which is the less favorable case. Then, by (45), (37), and
(39), 


(68) sass 


"aes we 


Similarly 


, 1 4agt F 
7 240° (+% yy < stal *) 


Using (62) we have therefore, neglecting the non-negative terms B and C, 


wo - 1 

2404 3302 ~ 25004 

5 _ 3M SS = 1 e 
206 120° 2407 * 


(70) 

2 4.0 
The expression at the right side represents a parabola, and it suffices to show that 
it assumes positive values at the endpoints of our interval (39). Now 


uw 1 u 1 1 6 
Sam ee A ie ee 
(71) ( i) 3-18 1 7 ~ 407’ 
i <ign 
120° 


and simple arithmetic shows that, with (59) the expression within the braces 
more than counterbalances the negative terms outside.” If a > 0 the situation 
is more favorable and the estimate (59) can then be further improved.


THE VARIANCE OF THE MEASURE OF A TWO-DIMENSIONAL RANDOM
SET

By J. BRONOWSKI AND J. NEYMAN
Princes Risborough, England and the University of California


1. Introduction. In a recent paper H. E. Robbins¹ has solved the problem
of the variance of the measure of a one-dimensional random set. The present
paper treats a similar problem relating to a two-dimensional random set under
somewhat more general conditions.

Let $R$ denote a rectangle of dimensions $a \times b$ whose position is fixed. Let $R'$
denote another fixed rectangle concentric with $R$, its sides $a + \gamma$ and $b + \gamma$ (where
$\gamma > 0$) being parallel to the sides $a$ and $b$ respectively of $R$. Finally, let $\rho$ denote
a rectangle of fixed dimensions but variable position, whose sides $\alpha < 2\gamma$ and
$\beta < 2\gamma$ are parallel to $a$ and $b$ respectively, but the position of whose center will
be considered as random. In fact it will be assumed that the rectangle $\rho$ is
dropped on the plane of $R$ in a manner which satisfies the following two
assumptions:

(i) The probability that the center of $\rho$ falls within $R'$ exactly $s$ times has a
defined value $P_s$ for each $s = 0, 1, 2, \cdots$. Thus, if $\Psi(u)$ denotes the probability
generating function of $s$, so that

(1)    $\Psi(u) = \sum_{s=0}^{\infty} u^s P_s,$

then $\Psi(u)$ is assumed known but will be left arbitrary till the general result is
obtained.

(ii) Whenever a fixed number $s$ of centers of $\rho$ fall within $R'$, it will be assumed
that the probability that exactly $k$ centers of $\rho$ fall within any chosen sub-area $\omega$
contained in $R'$ is given by the binomial expression

(2)    $\frac{s!}{k!\,(s-k)!} \left(\frac{\omega}{R'}\right)^{k} \left(1 - \frac{\omega}{R'}\right)^{s-k}.$


Under the above conditions, denote by $E$ the set of all those points of $R$ which
are covered at least once by the rectangle $\rho$ during the course of the trials
considered. Let $X$ denote the measure of $E$. The purpose of this paper is to
evaluate the first two moments of $X$.

First, the computations will be made for the case when $s$ is fixed, i.e. when

(3)    $\Psi(u) = u^s.$

The values of the two moments of $X$ computed for fixed $s$ will be denoted by
$M_1(a, b \mid s)$ and $M_2(a, b \mid s)$. Next, the moments of $X$ will be evaluated for an
arbitrary generating function $\Psi(u)$, and these will be denoted by $M_1(a, b)$ and
$M_2(a, b)$.

¹ H. E. Robbins, "On the measure of a random set", Annals of Math. Stat., Vol. 15
(1944), pp. 70-74.

H. E. Robbins has found the first moment

(4)    $M_1(a, b \mid s) = ab\left\{1 - \left(1 - \frac{\alpha\beta}{R'}\right)^{s}\right\}.$


Also, for a one-dimensional set, he has obtained the second moment, say M,(a\s), 
when a < a. 


It follows immediately from (4) and (1) that, whatever be the probability
generating function $\Psi(u)$,

(5)    $M_1(a, b) = ab\left\{1 - \Psi\left(1 - \frac{\alpha\beta}{R'}\right)\right\}.$

In particular, if the probabilities $P_s$ are those of Poisson when the density of
positions of the center of $\rho$ per unit of area is $\lambda$, so that

(6)    $\Psi(u) = e^{\lambda R'(u - 1)},$

then

(7)    $M_1(a, b) = ab\{1 - e^{-\lambda\alpha\beta}\}.$

Our remaining problem, therefore, is that of evaluating the second moment of
$X$. Instead we shall evaluate the second moment of

(8)    $Y = ab - X,$

and shall denote it by $m(a, b \mid s)$ or $m(a, b)$ according as $s$ is or is not considered
to be fixed.
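The passage from (4) to (7) can be verified numerically by averaging the fixed-$s$ formula over Poisson probabilities. In the sketch below `area` stands for the area of $R'$; the numerical values and the function names are hypothetical choices made for illustration.

```python
import math

def first_moment_fixed_s(a, b, alpha, beta, area, s):
    """M_1(a, b | s) of (4); `area` is the area of R'."""
    return a * b * (1.0 - (1.0 - alpha * beta / area) ** s)

def first_moment_poisson_sum(a, b, alpha, beta, area, lam, terms=200):
    """Average (4) over Poisson probabilities P_s with mean lam * area."""
    mean = lam * area
    log_p = -mean                      # log P_0
    total = 0.0
    for s in range(terms):
        total += math.exp(log_p) * first_moment_fixed_s(a, b, alpha, beta, area, s)
        log_p += math.log(mean) - math.log(s + 1)   # log P_{s+1}
    return total

def first_moment_poisson_closed(a, b, alpha, beta, lam):
    """M_1(a, b) of (7)."""
    return a * b * (1.0 - math.exp(-lam * alpha * beta))

# Illustrative values only.
a, b, alpha, beta, area, lam = 3.0, 2.0, 1.0, 0.5, 20.0, 0.8
summed = first_moment_poisson_sum(a, b, alpha, beta, area, lam)
closed = first_moment_poisson_closed(a, b, alpha, beta, lam)
print(summed, closed)
```

The area of $R'$ drops out of the closed form, as (7) shows: only the density $\lambda$ and the area $\alpha\beta$ of the dropped rectangle matter.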


2. Derivative of the second moment of Y. In order to evaluate $m(a, b)$, we
begin by calculating its second (mixed) derivative, say $D(a, b \mid s)$, where

(9)    $D(a, b \mid s) = \frac{\partial^2 m(a, b \mid s)}{\partial a\, \partial b}$
       $= \lim_{\Delta a, \Delta b \to 0} \frac{1}{\Delta a \Delta b}\{m(a + \Delta a, b + \Delta b \mid s) - m(a, b + \Delta b \mid s) - m(a + \Delta a, b \mid s) + m(a, b \mid s)\}$
       $= \lim_{\Delta a, \Delta b \to 0} \frac{1}{\Delta a \Delta b}\, I(\Delta a, \Delta b)$ (say),

where $\Delta a$ and $\Delta b$ are the increments of $a$ and $b$ respectively. Once $D(a, b \mid s)$
is found, the formula for $m(a, b \mid s)$ will be obtained by two quadratures. For
definiteness we shall assume $\Delta a$ and $\Delta b$ both to be positive, but of course the
argument which follows applies equally to other cases.

Consider the rectangle of dimensions $(a + \Delta a)$ and $(b + \Delta b)$ as shown in Figure
1, and denote by $U$, $V$ and $W$ the measures of the "uncovered" parts of the three
rectangles $\Delta a \times b$, $a \times \Delta b$, and $\Delta a \times \Delta b$ respectively. That is to say, $U$, $V$
and $W$ are defined with respect to these three rectangles precisely in the same
manner in which $Y$ is defined with respect to the original rectangle $a \times b = R$.
Using the letter $E$ to denote the expectation, we easily find that

(10)    $I(\Delta a, \Delta b) = 2E(YW) + 2E(UV)$
        $\qquad + 2E(VW) + 2E(UW) + E(W^2).$

However, each of the three expectations in the second line of formula (10) is
infinitesimal of an order higher than the product $\Delta a \Delta b$. In fact, none of the
variables $U$, $V$ and $W$ can exceed the area of the rectangle of which it forms part;
that is,

(11)    $0 \leq U \leq b\,\Delta a, \qquad 0 \leq V \leq a\,\Delta b, \qquad 0 \leq W \leq \Delta a\,\Delta b.$

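It follows that

(12)    $0 \leq E(UW) \leq b(\Delta a)^2 \Delta b, \qquad 0 \leq E(VW) \leq a\,\Delta a(\Delta b)^2, \qquad 0 \leq E(W^2) \leq (\Delta a \Delta b)^2.$

Hence, from (9), (10) and (12)

(13)    $D(a, b \mid s) = \lim_{\Delta a, \Delta b \to 0} \frac{2}{\Delta a \Delta b}\{E(YW) + E(UV)\}.$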

We now reduce the calculation of (13) to finite form by approximating to the
infinite sets $Y$, $U$, $V$, $W$ by progressively more ample but finite sets. To do so,
we cover $R'$ by progressively more ample but finite networks of points. More
precisely: consider a rectangular system of axes $O\xi$ and $O\eta$ oriented as in Figure 1
so that the axes are common boundaries of $a \times b = R$ and of the rectangles
obtained by increasing $a$ and $b$. Let

(14)    $d_n = a/(n + 1), \qquad \delta_n = b/(n + 1).$

Consider the lattice of points $(ij)$ with coordinates

(15)    $\xi_i^{(n)} = i d_n, \qquad \eta_j^{(n)} = j \delta_n$

for $i = -\nu_1^{(n)}, -\nu_1^{(n)} + 1, \cdots, 0, 1, 2, \cdots, n$; $j = -\nu_2^{(n)}, -\nu_2^{(n)} + 1, \cdots, 0, 1, 2, \cdots, n$,
where $\nu_1^{(n)}$ and $\nu_2^{(n)}$ are the greatest integers such that

(16)    $\nu_1^{(n)} d_n \leq \Delta a$

and

(17)    $\nu_2^{(n)} \delta_n \leq \Delta b.$

To simplify the writing, the superscripts $(n)$ will henceforth be dropped.

With every point $(ij)$ we associate a random variable $x_{ij}$ defined as follows.
If in the course of the trials contemplated none of the rectangles $\rho$ covers $(ij)$,
then $x_{ij} = 1$. Otherwise $x_{ij} = 0$. Further, write

(18)    $Y_n = d_n \delta_n \sum_{i=0}^{n} \sum_{j=0}^{n} x_{ij},$
        $U_n = d_n \delta_n \sum_{i=-\nu_1}^{0} \sum_{j=0}^{n} x_{ij},$
        $V_n = d_n \delta_n \sum_{i=0}^{n} \sum_{j=-\nu_2}^{0} x_{ij},$
        $W_n = d_n \delta_n \sum_{i=-\nu_1}^{0} \sum_{j=-\nu_2}^{0} x_{ij}.$
Now the boundary of the set $E$, for a fixed $s$, consists of one or more polygons
having a finite total number of sides each of bounded length. It follows that,
given any $\epsilon > 0$, there exists, for a fixed $s$, a number $N_\epsilon(s)$ such that $n > N_\epsilon(s)$
implies that

(19)    $|Y_n - Y| < \epsilon$

with similar inequalities relating to $U_n$, $V_n$ and $W_n$. Hence it follows immediately
that

(20)    $\lim_{n \to \infty} E(Y_n W_n \mid s) = E(YW \mid s), \qquad \lim_{n \to \infty} E(U_n V_n \mid s) = E(UV \mid s).$

The expectations in formula (13) will therefore be obtained as limits of those
on the left hand sides of (20). We have


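(21)    $E(Y_n W_n \mid s) = d_n^2 \delta_n^2 \sum_{i=-\nu_1}^{0} \sum_{j=-\nu_2}^{0} E\Big(x_{ij} \sum_{k=0}^{n} \sum_{l=0}^{n} x_{kl} \,\Big|\, s\Big),$

(22)    $E(U_n V_n \mid s) = d_n^2 \delta_n^2 \sum_{i=-\nu_1}^{0} \sum_{j=0}^{n} E\Big(x_{ij} \sum_{k=0}^{n} \sum_{l=-\nu_2}^{0} x_{kl} \,\Big|\, s\Big).$

Hitherto we have made no assumptions concerning the values of $\Delta a$ and $\Delta b$.
Since these are to tend to zero, we may assume that

(23)    $0 < \Delta a < \gamma - \alpha/2, \qquad 0 < \Delta b < \gamma - \beta/2.$

On this assumption, we shall now compute the expectations of the type
$E(x_{ij} x_{kl} \mid s)$, of which (21) and (22) are linear combinations.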

Since the variables $x_{ij}$ and $x_{kl}$ are capable only of the two values unity and
zero, the expectation of their product is simply the probability that both of them
are equal to unity, i.e. the probability that both points $(ij)$ and $(kl)$ are "missed"
by all the $s$ rectangles $\rho$ falling on $R'$. This probability may have one of two
forms. If both

(24)    $d_n |i - k| \leq \alpha$ and $\delta_n |j - l| \leq \beta,$

then

(25)    $E(x_{ij} x_{kl} \mid s) = \left\{1 - \frac{2\alpha\beta - (\alpha - d_n |i - k|)(\beta - \delta_n |j - l|)}{R'}\right\}^{s},$

while otherwise

(26)    $E(x_{ij} x_{kl} \mid s) = \left(1 - \frac{2\alpha\beta}{R'}\right)^{s},$

in each case, in virtue of the assumption (ii) of Section 1.

The essential content of equations (24) to (26) is that, once the other variables
appearing in them are assigned, $E(x_{ij} x_{kl} \mid s)$ is a function only of the differences
$i - k$ and $j - l$. It is this fact which allows us to evaluate the limits of the
quantities in (21) and (22) in a simple manner, in effect by holding one of the
two freely variable points $(ij)$, $(kl)$ in a fixed position, say at the origin. Thus,
let

(27)    $E(\theta_n \mid s) = d_n^2 \delta_n^2 \sum_{i=-\nu_1}^{0} \sum_{j=-\nu_2}^{0} E\Big(x_{ij} \sum_{k=i}^{n+i} \sum_{l=j}^{n+j} x_{kl} \,\Big|\, s\Big).$

Owing to the remark just made, the expectation

(28)    $E\Big(x_{ij} \sum_{k=i}^{n+i} \sum_{l=j}^{n+j} x_{kl} \,\Big|\, s\Big) = E\Big(x_{00} \sum_{k=0}^{n} \sum_{l=0}^{n} x_{kl} \,\Big|\, s\Big),$

and it follows that

(29)    $E(\theta_n \mid s) = (\nu_1 + 1)(\nu_2 + 1)\, d_n^2 \delta_n^2\, E\Big(x_{00} \sum_{k=0}^{n} \sum_{l=0}^{n} x_{kl} \,\Big|\, s\Big)$
        $= [(\nu_1 + 1)(\nu_2 + 1)\, d_n \delta_n] \Big[d_n \delta_n \sum_{k=0}^{n} \sum_{l=0}^{n} E(x_{00} x_{kl} \mid s)\Big].$

Of the two factors in the square brackets in (29), the first tends to $\Delta a \Delta b$ as $n$
tends to infinity, and the second tends to the integral

(30)    $\int_0^a \int_0^b f^s(\xi, \eta)\, d\xi\, d\eta,$

where

(31)    $f(\xi, \eta) = 1 - \frac{2\alpha\beta - (\alpha - \xi)(\beta - \eta)}{R'}$

if both $0 \leq \xi \leq \alpha$ and $0 \leq \eta \leq \beta$, and

(32)    $f(\xi, \eta) = 1 - \frac{2\alpha\beta}{R'}$

otherwise. Thus the computation of the limit of $E(\theta_n \mid s)$ is straightforward.
It remains to show that it differs from that of $E(Y_n W_n \mid s)$ in equation (21) by
an infinitesimal which is of an order higher than the product $\Delta a \Delta b$.

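Since the variables $x_{ij}$ are capable only of the two values unity and zero, the
absolute value of the difference between the brackets in (21) and (27), that is,
between

(33)    $\sum_{k=i}^{n+i} \sum_{l=j}^{n+j} x_{kl}$ and $\sum_{k=0}^{n} \sum_{l=0}^{n} x_{kl},$

cannot be greater than $-n(i + j) \leq n(\nu_1 + \nu_2)$. It follows that

(34)    $|E(Y_n W_n \mid s) - E(\theta_n \mid s)| \leq [d_n \delta_n (\nu_1 + 1)(\nu_2 + 1)][n \delta_n \nu_1 d_n + n d_n \nu_2 \delta_n].$

As $n$ tends to infinity, the right hand side of (34) tends to the product

(35)    $\Delta a \Delta b\, [b \Delta a + a \Delta b];$

whence

(36)    $\lim_{\Delta a, \Delta b \to 0} \frac{1}{\Delta a \Delta b} \Big\{\lim_{n \to \infty} E(\theta_n \mid s)\Big\} = \lim_{\Delta a, \Delta b \to 0} \frac{1}{\Delta a \Delta b} E(YW \mid s) = \int_0^a \int_0^b f^s(\xi, \eta)\, d\xi\, d\eta.$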

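A very similar procedure will serve to evaluate the limit of $E(UV \mid s)/\Delta a \Delta b$.
Here, we replace the two freely variable points $(ij)$, $(kl)$ by two semi-fixed points,
one being restricted to the axis $O\xi$ and the other to the axis $O\eta$. More precisely,
instead of considering $E(U_n V_n \mid s)$ in equation (22) we consider, say,

(37)    $E(\varphi_n \mid s) = d_n^2 \delta_n^2 \sum_{i=-\nu_1}^{0} \sum_{j=0}^{n} E\Big(x_{ij} \sum_{k=i}^{n+i} \sum_{l=-\nu_2}^{0} x_{kl} \,\Big|\, s\Big),$

and it is easy to see that

(38)    $\lim_{n \to \infty} |E(U_n V_n \mid s) - E(\varphi_n \mid s)| \leq b(\Delta a)^2 (\Delta b),$

so that the quantity (37) may be used in equations (13) and (20) in place of the
quantity (22). However, since $E(x_{ij} x_{kl} \mid s)$ depends only on the differences
$i - k$ and $j - l$,

(39)    $E\Big(x_{ij} \sum_{k=i}^{n+i} \sum_{l=-\nu_2}^{0} x_{kl} \,\Big|\, s\Big) = E\Big(x_{0j} \sum_{k=0}^{n} \sum_{l=-\nu_2}^{0} x_{kl} \,\Big|\, s\Big),$

and therefore

(40)    $E(\varphi_n \mid s) = \{d_n(\nu_1 + 1)\} \Big\{d_n \delta_n^2 \sum_{j=0}^{n} E\Big(x_{0j} \sum_{k=0}^{n} \sum_{l=-\nu_2}^{0} x_{kl} \,\Big|\, s\Big)\Big\}.$

Further, and in the same way, we may replace the sum in (40), namely

(41)    $\sum_{j=0}^{n} E\Big(x_{0j} \sum_{k=0}^{n} \sum_{l=-\nu_2}^{0} x_{kl} \,\Big|\, s\Big) = \sum_{k=0}^{n} \sum_{l=-\nu_2}^{0} E\Big(x_{kl} \sum_{j=0}^{n} x_{0j} \,\Big|\, s\Big),$

by the simpler sum

(42)    $\sum_{k=0}^{n} \sum_{l=-\nu_2}^{0} E\Big(x_{kl} \sum_{j=l}^{n+l} x_{0j} \,\Big|\, s\Big) = (\nu_2 + 1) \sum_{k=0}^{n} E\Big(x_{k0} \sum_{j=0}^{n} x_{0j} \,\Big|\, s\Big)$
        $= (\nu_2 + 1) \sum_{k=0}^{n} \sum_{j=0}^{n} E(x_{k0} x_{0j} \mid s).$

It follows that we may replace the limit of $E(U_n V_n \mid s)$ as expressed in (22) by

(43)    $\lim_{n \to \infty} \{d_n(\nu_1 + 1)\, \delta_n(\nu_2 + 1)\} \Big\{d_n \delta_n \sum_{k=0}^{n} \sum_{j=0}^{n} E(x_{k0} x_{0j} \mid s)\Big\},$

and this is easily found to be equal to

(44)    $\Delta a \Delta b \int_0^a \int_0^b f^s(\xi, \eta)\, d\xi\, d\eta,$

where $f(\xi, \eta)$ is defined by the formulae (31) and (32).
Collecting this result with that expressed by (36), and substituting in equation
(13), we therefore have finally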





(45)    $D(a, b \mid s) = 4 \int_0^a \int_0^b f^s(\xi, \eta)\, d\xi\, d\eta.$

3. The forms of the derivative. Since the function $f(\xi, \eta)$ has two different
forms (31) or (32) depending on the relationships between $a$, $b$, $\alpha$ and $\beta$, it will
be necessary to distinguish four different forms of the derivative (45), and of its
integral.

First, for values of $a$ and $b$ for which simultaneously

(46)    $a \leq \alpha$ and $b \leq \beta,$

the integrand in (45) has the form (31) for the whole region of integration.
Hence the value of $D(a, b \mid s)$ in the region (46) is given by, say

(47)    $D_1 = 4 \int_0^a \int_0^b \left(1 - \frac{2\alpha\beta - (\alpha - \xi)(\beta - \eta)}{R'}\right)^{s} d\xi\, d\eta = 4 \int_{\alpha - a}^{\alpha} \int_{\beta - b}^{\beta} g^s(t, \tau)\, dt\, d\tau,$

where

(48)    $g(t, \tau) = 1 - \frac{2\alpha\beta - t\tau}{R'}.$

Next, when $a \geq \alpha$ but $b \leq \beta$, the integrand in (45) has the form determined
by (31) only when

(49)    $0 \leq \xi \leq \alpha,$

whereas when

(50)    $\alpha < \xi \leq a$

the appropriate form is that determined by (32). Therefore here $D(a, b \mid s)$
has the form, say,

(51)    $D_2 = 4b(a - \alpha)\left(1 - \frac{2\alpha\beta}{R'}\right)^{s} + 4 \int_0^{\alpha} \int_{\beta - b}^{\beta} g^s(t, \tau)\, dt\, d\tau.$

Similarly, for

(52)    $a \leq \alpha$ but $b \geq \beta,$

$D(a, b \mid s)$ is given by, say,

(53)    $D_3 = 4a(b - \beta)\left(1 - \frac{2\alpha\beta}{R'}\right)^{s} + 4 \int_{\alpha - a}^{\alpha} \int_0^{\beta} g^s(t, \tau)\, dt\, d\tau.$

Finally, in the region in which simultaneously

(54)    $a \geq \alpha$ and $b \geq \beta,$

$D(a, b \mid s)$ has the form, say,


(55)    $D_4 = 4(ab - \alpha\beta)\left(1 - \frac{2\alpha\beta}{R'}\right)^{s} + 4 \int_0^{\alpha} \int_0^{\beta} g^s(t, \tau)\, dt\, d\tau.$

4. The second moment of Y. We have now to determine $m(a, b \mid s)$ for all
non-negative values of $a$ and $b$, from the equation

(56)    $\frac{\partial^2 m(a, b \mid s)}{\partial a\, \partial b} = D(a, b \mid s).$

The general solution of this equation is

(57)    $m(a, b \mid s) = \iint D(a, b \mid s)\, da\, db + A(a) + B(b),$

where $A(a)$ and $B(b)$ are each functions of one variable. These functions are
determined by the boundary conditions, namely

(58)    $m(a, 0 \mid s) = m(0, b \mid s) = \frac{\partial m(a, 0 \mid s)}{\partial a} = \frac{\partial m(0, b \mid s)}{\partial b} = 0,$

which are a consequence of the inequality $0 \leq Y \leq ab$. It is then easily found
that the only solution $m(a, b \mid s)$ satisfying (57) and (58) has the following four
different forms, depending on the values of $a$ and $b$.

If $a \leq \alpha$ and $b \leq \beta$, then

(59)    $m(a, b \mid s) = \int_0^a \int_0^b D_1(x, y)\, dx\, dy = m_1(a, b \mid s)$ (say).

If $a \geq \alpha$ and $b \leq \beta$, then

(60)    $m(a, b \mid s) = m_1(\alpha, b \mid s) + \int_{\alpha}^{a} \int_0^b D_2(x, y)\, dx\, dy = m_2(a, b \mid s)$ (say).

If $a \leq \alpha$ and $b \geq \beta$, then

(61)    $m(a, b \mid s) = m_1(a, \beta \mid s) + \int_0^a \int_{\beta}^{b} D_3(x, y)\, dx\, dy = m_3(a, b \mid s)$ (say).

Finally, if $a \geq \alpha$ and $b \geq \beta$, then

(62)    $m(a, b \mid s) = m_1(\alpha, \beta \mid s) + \int_{\alpha}^{a} \int_0^{\beta} D_2(x, y)\, dx\, dy + \int_0^{\alpha} \int_{\beta}^{b} D_3(x, y)\, dx\, dy$
        $\qquad + \int_{\alpha}^{a} \int_{\beta}^{b} D_4(x, y)\, dx\, dy = m_4(a, b \mid s)$ (say).

The procedure used to evaluate the integrals (59) to (62) follows the same
general pattern, and we shall confine ourselves to outlining it in one case, say (59).

(63)    $m_1(a, b \mid s) = \int_0^a \int_0^b D_1(x, y)\, dx\, dy$
        $= 4 \int_0^a \int_0^b dx\, dy \int_{\alpha - x}^{\alpha} \int_{\beta - y}^{\beta} g^s(t, \tau)\, dt\, d\tau$
        $= 4 \int_0^a dx \int_{\alpha - x}^{\alpha} dt \Big\{\int_0^b dy \int_{\beta - y}^{\beta} g^s(t, \tau)\, d\tau\Big\}.$

Integrating the double integral in the braces by parts for $y$ we get, say,

(64)    $I(t) = \int_0^b dy \int_{\beta - y}^{\beta} g^s(t, \tau)\, d\tau = \Big[y \int_{\beta - y}^{\beta} g^s(t, \tau)\, d\tau\Big]_0^b - \int_0^b y\, g^s(t, \beta - y)\, dy,$

whence, substituting $\beta - y = \tau$ in the last integral,

(65)    $I(t) = b \int_{\beta - b}^{\beta} g^s(t, \tau)\, d\tau - \int_{\beta - b}^{\beta} (\beta - \tau)\, g^s(t, \tau)\, d\tau = \int_{\beta - b}^{\beta} (\tau + b - \beta)\, g^s(t, \tau)\, d\tau.$

Proceeding now in the same manner with the other double integration in (63),
we conclude that

(66)    $m_1(a, b \mid s) = 4 \int_0^a dx \int_{\alpha - x}^{\alpha} I(t)\, dt = 4 \int_{\alpha - a}^{\alpha} (t + a - \alpha)\, I(t)\, dt$
        $= 4 \int_{\alpha - a}^{\alpha} \int_{\beta - b}^{\beta} (t + a - \alpha)(\tau + b - \beta)\, g^s(t, \tau)\, dt\, d\tau,$

where, throughout, $g(t, \tau)$ is defined by (48).

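Formulae for $m_2(a,b \mid s)$, $m_3(a,b \mid s)$ and $m_4(a,b \mid s)$ are obtained by a similar
procedure. They may conveniently be summarized in the following single
expression. Define a symbol [x] for any real number x by the equations

\[ [x] = x \ \text{ if } x \ge 0, \qquad [x] = 0 \ \text{ if } x < 0. \tag{67} \]

With this notation, whatever be the relation between a, b, $\alpha$ and $\beta$, we have

\[ \begin{aligned} m(a,b \mid s) &= 4\int_{[\alpha-a]}^{\alpha}\int_{[\beta-b]}^{\beta} (t+a-\alpha)(\tau+b-\beta)\left(1-\frac{2\alpha\beta-t\tau}{R'}\right)^{s} dt\,d\tau \\ &\quad + \left\{a^{2}[b-\beta]^{2} + b^{2}[a-\alpha]^{2} - [a-\alpha]^{2}[b-\beta]^{2}\right\}\left(1-\frac{2\alpha\beta}{R'}\right)^{s}. \end{aligned} \tag{68} \]

We now allow s to take all values s = 0, 1, 2, ... with probabilities $P_s$ given
by the generating function (1). Then it follows, from the form of (68), that

\[ \begin{aligned} m(a,b) &= 4\int_{[\alpha-a]}^{\alpha}\int_{[\beta-b]}^{\beta} (t+a-\alpha)(\tau+b-\beta)\,\Psi\!\left(1-\frac{2\alpha\beta-t\tau}{R'}\right) dt\,d\tau \\ &\quad + \left\{a^{2}[b-\beta]^{2} + b^{2}[a-\alpha]^{2} - [a-\alpha]^{2}[b-\beta]^{2}\right\}\Psi\!\left(1-\frac{2\alpha\beta}{R'}\right). \end{aligned} \tag{69} \]

On subtracting from this the square of the first moment of Y, which by (5)
and (8) is

\[ a^{2}b^{2}\,\Psi^{2}\!\left(1-\frac{\alpha\beta}{R'}\right), \]

we obtain the variance $\sigma_Y^2$ of Y. But the variance of Y is necessarily equal to the
variance $\sigma_X^2$ of X.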


5. Particular cases. (i) $\Psi_1(u) = u^{s}$. This is the case, considered originally,
in which the number s of centers of the rectangles $\rho$ falling within R' is fixed.
The explicit evaluation of the variance $\sigma_X^2$ depends in this case on the evaluation
of the integral

\[ \int_{[\alpha-a]}^{\alpha}\int_{[\beta-b]}^{\beta} (t+a-\alpha)(\tau+b-\beta)\left\{\left(1-\frac{2\alpha\beta}{R'}\right) + \frac{t\tau}{R'}\right\}^{s} dt\,d\tau. \tag{70} \]


The evaluation is easy if one expands the binomial under the sign of the integral 
and integrates term by term. Each such integral is a product of two simple 
integrals. 

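(ii) $\Psi_2(u) = e^{\lambda R'(u-1)}$, Poisson Case. This is the case where the probabilities
$P_s$ that there are exactly s centers of rectangles $\rho$ within R' are given by the
Poisson Law, $P_s = (\lambda R')^{s} e^{-\lambda R'}/s!$. Substituting the expression of the probability
generating function into (69), we obtain for this case

\[ \begin{aligned} m(a,b) &= 4e^{-2\lambda\alpha\beta}\int_{[\alpha-a]}^{\alpha}\int_{[\beta-b]}^{\beta} (t+a-\alpha)(\tau+b-\beta)\sum_{s=0}^{\infty}\frac{(\lambda t\tau)^{s}}{s!}\,dt\,d\tau \\ &\quad + e^{-2\lambda\alpha\beta}\left\{a^{2}[b-\beta]^{2} + b^{2}[a-\alpha]^{2} - [a-\alpha]^{2}[b-\beta]^{2}\right\}. \end{aligned} \tag{71} \]

On performing the integration term by term, and contracting the first term
of the resulting infinite series into the second line of equation (71), we readily
obtain the result

\[ \begin{aligned} m(a,b) &= 4e^{-2\lambda\alpha\beta}\sum_{s=1}^{\infty}\frac{(\lambda\alpha\beta)^{s}\,\alpha\beta}{s!\,(s+1)^{2}(s+2)^{2}} \\ &\quad\times\left\{(s+2)a-\alpha+[\alpha-a]\left(1-\frac{a}{\alpha}\right)^{s+1}\right\} \\ &\quad\times\left\{(s+2)b-\beta+[\beta-b]\left(1-\frac{b}{\beta}\right)^{s+1}\right\} + a^{2}b^{2}e^{-2\lambda\alpha\beta}, \end{aligned} \tag{72} \]

where [x] continues to have the meaning defined by (67). In virtue of equations
(7) and (8), however, the last term of the expression (72) is precisely the square
of the first moment of Y when s is Poisson distributed. Hence, for s Poisson
distributed, we have the expression for the variance of Y and of X,

\[ \begin{aligned} \sigma_X^{2} = \sigma_Y^{2} &= 4e^{-2\lambda\alpha\beta}\sum_{s=1}^{\infty}\frac{(\lambda\alpha\beta)^{s}\,\alpha\beta}{s!\,(s+1)^{2}(s+2)^{2}} \\ &\quad\times\left\{(s+2)a-\alpha+[\alpha-a]\left(1-\frac{a}{\alpha}\right)^{s+1}\right\} \\ &\quad\times\left\{(s+2)b-\beta+[\beta-b]\left(1-\frac{b}{\beta}\right)^{s+1}\right\}. \end{aligned} \tag{73} \]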
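The term-by-term integration behind (72) and (73) rests on a one-variable identity: the integral of $(t+a-\alpha)\,t^{s}$ from $[\alpha-a]$ to $\alpha$ equals $\alpha^{s+1}\{(s+2)a-\alpha+[\alpha-a](1-a/\alpha)^{s+1}\}/((s+1)(s+2))$. The following Python sketch compares that closed form with a plain midpoint-rule quadrature. The function names and the choice of quadrature rule are illustrative assumptions, not part of the paper.

```python
def bracket(x):
    """The symbol [x] of (67): x if x >= 0, otherwise 0."""
    return x if x >= 0 else 0.0


def closed_form(a, alpha, s):
    """Closed form of the integral of (t + a - alpha) * t**s over [[alpha - a], alpha]."""
    factor = (s + 2) * a - alpha + bracket(alpha - a) * (1 - a / alpha) ** (s + 1)
    return alpha ** (s + 1) * factor / ((s + 1) * (s + 2))


def midpoint_integral(a, alpha, s, steps=20000):
    """Midpoint-rule approximation of the same integral (hypothetical helper)."""
    lower = bracket(alpha - a)
    h = (alpha - lower) / steps
    total = 0.0
    for k in range(steps):
        t = lower + (k + 0.5) * h
        total += (t + a - alpha) * t ** s
    return total * h


if __name__ == "__main__":
    # One case with a < alpha and one with a >= alpha.
    for a, alpha, s in [(0.5, 1.0, 3), (2.0, 1.0, 3)]:
        print(a, alpha, s, closed_form(a, alpha, s), midpoint_integral(a, alpha, s))
```

Applying the identity once in t and once in $\tau$ gives the two braces in (72); the factor $\lambda^{s}/s!$ comes from the exponential series in (71).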
(iii) $\Psi_3(u) = e^{-m\left(1-e^{\nu R'(u-1)}\right)}$, Contagious case. This is the case where the probabilities
$P_s$ that there are exactly s centers of rectangles $\rho$ within R' are given by
the contagious law of type A with two parameters. The evaluation of the
second moment of Y is made easy by noticing that the probability generating
function appropriate to the contagious distribution may be expressed as
a series in terms of the probability generating function of the Poisson Law

\[ \Psi_3(u) = e^{-m}\sum_{k \ge 0}\frac{m^{k}}{k!}\,\Psi_2(u)\Big|_{\lambda = k\nu} = e^{-m}\sum_{k \ge 0}\frac{m^{k}}{k!}\,e^{k\nu R'(u-1)}. \tag{74} \]

Thus the evaluation of the integral intervening in the formula for the second
moment of Y is reduced in the present case to that of formula (71).


6. Remarks on other cases. (i) It may be of interest, in amplification of
H. E. Robbins' results, to exhibit the analogues of formulas (68), (69) and (73)
in the one-dimensional case. For this case, then, if the interval a is embedded
in a larger interval a', we obtain by similar methods, beginning with the calculation
of $d^{2}m(a \mid s)/da^{2}$,

\[ m(a \mid s) = 2\int_{[\alpha-a]}^{\alpha} (t+a-\alpha)\left(1-\frac{2\alpha-t}{a'}\right)^{s} dt + [a-\alpha]^{2}\left(1-\frac{2\alpha}{a'}\right)^{s}, \tag{75} \]

whence

\[ m(a) = 2\int_{[\alpha-a]}^{\alpha} (t+a-\alpha)\,\Psi\!\left(1-\frac{2\alpha-t}{a'}\right) dt + [a-\alpha]^{2}\,\Psi\!\left(1-\frac{2\alpha}{a'}\right); \tag{76} \]

in particular, if s is Poisson distributed,

\[ \sigma_X^{2} = \sigma_Y^{2} = 2e^{-2\lambda\alpha}\sum_{s=1}^{\infty}\frac{(\lambda\alpha)^{s}\,\alpha}{s!\,(s+1)(s+2)}\left\{(s+2)a-\alpha+[\alpha-a]\left(1-\frac{a}{\alpha}\right)^{s+1}\right\}. \]

The close parallel between these formulas and those for two dimensions makes it
natural to conjecture analogous formulas for n dimensions; but we have not
attempted to establish such formulas.

(ii) For the evaluation of the higher moments of Y it may be useful to notice 
that precisely the same method as that described above leads to the conclusion 
that the derivative of the n-th non central moment of Y is 

d° m,(a, b) 


° ] 1 wn— r n—2 77 
ee * De ee ee ore. 


(77) 


2 J. Neyman, “On a new class of contagious distributions’’, Annals of Math. Stat., 
Vol. 10 (1939), pp. 35-57. 








ON THE MEASURE OF A RANDOM SET. II 


By H. E. Robbins
Postgraduate School, U. S. Naval Academy 


1. Introduction. In a recent paper¹ the author derived general formulas for
the moments of the measure of any random set X, and applied the formulas to
find the mean and variance of a random sum of intervals on the line. In a
subsequent paper² J. Bronowski and J. Neyman, using other methods, found the
variance when X is a random sum of rectangles in the plane, and raised the
question of finding the variance when X is a random sum of n-dimensional
intervals in n-space. This will be done in the present paper, independently of
the work of Bronowski and Neyman, using the methods of (I). The corresponding
problem for circles in the plane will also be solved.


2. n-dimensional intervals, N fixed. Let the random set X be defined as
follows. Let $A_i$, $a_i$ (the range of the subscript i throughout this paper will be
from 1 to n) and $\delta$ be fixed positive numbers such that $a_i < 2\delta$. Let R denote the
n-dimensional interval consisting of all points $(x_1, \cdots, x_n)$ such that $0 < x_i < A_i$,
and let R' denote the larger interval for which $-\delta < x_i < A_i + \delta$ (and also
its measure $\prod(A_i + 2\delta)$). Let a fixed number N of intervals with sides $a_i$
parallel to the axes be chosen independently, with the probability density function
for the center of each interval constant and equal to 1/R' in R'. The set X
is the intersection of the set-theoretical sum of the N intervals with R. The set
Y consists of those points of R that do not belong to X. We have identically


(1) X+ Y=R, 


where capital letters denote either sets or their measures. 
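From (I), equation (15), we have

\[ E(Y) = \int_{0}^{A_n}\cdots\int_{0}^{A_1} p(x_1,\cdots,x_n)\,dx_1\cdots dx_n, \tag{2} \]

where, setting $r = \prod a_i$, we have

\[ p(x_1,\cdots,x_n) = \Pr((x_1,\cdots,x_n) \in Y) = \left(1-\frac{r}{R'}\right)^{N}. \tag{3} \]

Hence

\[ E(Y) = R\left(1-\frac{r}{R'}\right)^{N}. \tag{4} \]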
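Formula (4) can be checked by simulation in the simplest setting. The sketch below takes n = 1, draws N interval centers uniformly on $(-\delta, A+\delta)$, measures the uncovered part of (0, A), and averages over many trials. The function names, the numerical values, the seed and the number of trials are illustrative assumptions only.

```python
import random


def expected_uncovered(R, r, R_prime, N):
    """Formula (4): E(Y) = R * (1 - r / R')**N."""
    return R * (1.0 - r / R_prime) ** N


def simulate_uncovered_1d(A, a, delta, N, trials=20000, seed=0):
    """Monte Carlo mean of Y for n = 1: intervals of length a, centers uniform on R'."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        segments = []
        for _ in range(N):
            c = rng.uniform(-delta, A + delta)
            lo, hi = max(0.0, c - a / 2), min(A, c + a / 2)
            if hi > lo:
                segments.append((lo, hi))
        segments.sort()
        covered, end = 0.0, 0.0
        for lo, hi in segments:
            if hi > end:  # add only the part not already counted
                covered += hi - max(lo, end)
                end = hi
        total += A - covered
    return total / trials


if __name__ == "__main__":
    A, a, delta, N = 1.0, 0.2, 0.1, 5
    print(expected_uncovered(A, a, A + 2 * delta, N))
    print(simulate_uncovered_1d(A, a, delta, N))
```

With these values the two printed numbers should agree to about two decimal places; the agreement depends on the uniform density 1/R' for the centers stated in the text.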


1 H. E. Robbins. "On the measure of a random set," Annals of Math. Stat. Vol. 15
(1944), pp. 70-74. We shall refer to this paper as (I).

2 J. Bronowski and J. Neyman. "On the variance of a random set." Annals of
Math. Stat. Vol. 16 (1945), pp. 330-341. We shall refer to this paper as (BN).
From (1) it follows that

\[ E(X) = R\left\{1-\left(1-\frac{r}{R'}\right)^{N}\right\}. \tag{5} \]

From (I), equation (21), we have

\[ E(Y^{2}) = \int_{0}^{A_n}\cdots\int_{0}^{A_1}\int_{0}^{A_n}\cdots\int_{0}^{A_1} p(x_1,\cdots,x_n,y_1,\cdots,y_n)\,dx_1\cdots dx_n\,dy_1\cdots dy_n, \tag{6} \]

where

\[ p(x_1,\cdots,x_n,y_1,\cdots,y_n) = \Pr((x_1,\cdots,x_n) \in Y \text{ and } (y_1,\cdots,y_n) \in Y). \tag{7} \]

It is clear from the symmetry of the problem that the distribution of Y will be
unchanged if we assume that for all i, $x_i < y_i$. Hence, since there are $2^{n}$ possible
sets of n inequalities each, we can write

\[ E(Y^{2}) = 2^{n}\int_{0}^{A_n}\cdots\int_{0}^{A_1}\int_{0}^{y_n}\cdots\int_{0}^{y_1} p\;dx_1\cdots dx_n\,dy_1\cdots dy_n. \tag{8} \]


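We now introduce the new variables of integration

\[ u_i = x_i, \qquad v_i = y_i - x_i, \tag{9} \]

for which

\[ \frac{\partial(u_1,\cdots,u_n,v_1,\cdots,v_n)}{\partial(x_1,\cdots,x_n,y_1,\cdots,y_n)} = 1. \tag{10} \]

In terms of the new variables we have

\[ p = f(v_1,\cdots,v_n) = \begin{cases} \left(1-\dfrac{2r}{R'}\right)^{N} & \text{if } v_i \ge a_i \text{ for some } i, \\[2ex] \left(1-\dfrac{2r-\prod(a_i-v_i)}{R'}\right)^{N} & \text{if } v_i < a_i \text{ for all } i. \end{cases} \tag{11} \]

Equation (8) now becomes

\[ \begin{aligned} E(Y^{2}) &= 2^{n}\int_{0}^{A_n}\cdots\int_{0}^{A_1}\int_{0}^{A_n-v_n}\cdots\int_{0}^{A_1-v_1} f\;du_1\cdots du_n\,dv_1\cdots dv_n \\ &= 2^{n}\int_{0}^{A_n}\cdots\int_{0}^{A_1} f\,\prod(A_i-v_i)\,dv_1\cdots dv_n. \end{aligned} \tag{12} \]

Let $z_i = \min(a_i, A_i)$. Then from (11) and (12) we obtain

\[ \begin{aligned} E(Y^{2}) &= 2^{n}\int_{0}^{z_n}\cdots\int_{0}^{z_1}\left(1-\frac{2r-\prod(a_i-v_i)}{R'}\right)^{N}\prod(A_i-v_i)\,dv_1\cdots dv_n \\ &\quad + 2^{n}\left(1-\frac{2r}{R'}\right)^{N}\left\{\int_{0}^{A_n}\cdots\int_{0}^{A_1}\prod(A_i-v_i)\,dv_1\cdots dv_n \right. \\ &\qquad\qquad \left. - \int_{0}^{z_n}\cdots\int_{0}^{z_1}\prod(A_i-v_i)\,dv_1\cdots dv_n\right\}. \end{aligned} \tag{13} \]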
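Let the symbol [x], as in (BN), be defined by

\[ [x] = \begin{cases} x & \text{if } x \ge 0, \\ 0 & \text{if } x < 0. \end{cases} \tag{14} \]

In the integral in the first line of (13) we introduce the new variables of integration
$w_i = a_i - v_i$, while in the two integrals in the second line we introduce
the variables $s_i = A_i - v_i$. The result is

\[ \begin{aligned} E(Y^{2}) &= 2^{n}\int_{[a_n-A_n]}^{a_n}\cdots\int_{[a_1-A_1]}^{a_1}\left(1-\frac{2r-\prod w_i}{R'}\right)^{N}\prod(w_i+A_i-a_i)\,dw_1\cdots dw_n \\ &\quad + 2^{n}\left(1-\frac{2r}{R'}\right)^{N}\left\{\int_{0}^{A_n}\cdots\int_{0}^{A_1}\prod s_i\,ds_1\cdots ds_n - \int_{[A_n-a_n]}^{A_n}\cdots\int_{[A_1-a_1]}^{A_1}\prod s_i\,ds_1\cdots ds_n\right\} \\ &= 2^{n}\int_{[a_n-A_n]}^{a_n}\cdots\int_{[a_1-A_1]}^{a_1}\left(1-\frac{2r-\prod w_i}{R'}\right)^{N}\prod(w_i+A_i-a_i)\,dw_1\cdots dw_n \\ &\quad + \left(1-\frac{2r}{R'}\right)^{N}\left\{\prod A_i^{2} - \prod\left(A_i^{2}-[A_i-a_i]^{2}\right)\right\}. \end{aligned} \tag{15} \]

From (1) we see that $\sigma_X^{2} = E(X^{2}) - E^{2}(X) = E(Y^{2}) - E^{2}(Y)$. Thus from (4)
and (15) we have

\[ \begin{aligned} \sigma_X^{2} &= 2^{n}\int_{[a_n-A_n]}^{a_n}\cdots\int_{[a_1-A_1]}^{a_1}\left(1-\frac{2r-\prod w_i}{R'}\right)^{N}\prod(w_i+A_i-a_i)\,dw_1\cdots dw_n \\ &\quad + \left(1-\frac{2r}{R'}\right)^{N}\left\{\prod A_i^{2} - \prod\left(A_i^{2}-[A_i-a_i]^{2}\right)\right\} - R^{2}\left(1-\frac{r}{R'}\right)^{2N}. \end{aligned} \tag{16} \]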


3. n-dimensional intervals, N variable. Now let X and Y be defined as before
except that the number N is taken as a random variable, capable of assuming the
values 0, 1, ... with respective probabilities $p_0, p_1, \cdots$, and with generating
function

\[ \varphi(t) = \sum_{N=0}^{\infty} p_N\,t^{N}. \tag{17} \]
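Then from (5) we have

\[ E(X) = \sum_{N=0}^{\infty} p_N\,R\left\{1-\left(1-\frac{r}{R'}\right)^{N}\right\} = R\left\{1-\varphi\!\left(1-\frac{r}{R'}\right)\right\}, \tag{18} \]

while from (15) we have

\[ \begin{aligned} \sigma_X^{2} = E(Y^{2}) - E^{2}(Y) &= 2^{n}\int_{[a_n-A_n]}^{a_n}\cdots\int_{[a_1-A_1]}^{a_1}\varphi\!\left(1-\frac{2r-\prod w_i}{R'}\right)\prod(w_i+A_i-a_i)\,dw_1\cdots dw_n \\ &\quad + \varphi\!\left(1-\frac{2r}{R'}\right)\left\{\prod A_i^{2} - \prod\left(A_i^{2}-[A_i-a_i]^{2}\right)\right\} - R^{2}\varphi^{2}\!\left(1-\frac{r}{R'}\right). \end{aligned} \tag{19} \]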

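In particular, suppose that, as in (BN), N has a Poisson distribution with a
parameter $\lambda$,

\[ p_N = e^{-\lambda R'}\,\frac{(\lambda R')^{N}}{N!}, \tag{20} \]

so that

\[ \varphi(t) = e^{\lambda R'(t-1)}. \tag{21} \]

Then (18) becomes

\[ E(X) = R\left\{1-e^{-\lambda r}\right\}, \tag{22} \]

while (19) becomes

\[ \begin{aligned} \sigma_X^{2} &= 2^{n}e^{-2\lambda r}\int_{[a_n-A_n]}^{a_n}\cdots\int_{[a_1-A_1]}^{a_1}\sum_{N=0}^{\infty}\frac{(\lambda\prod w_i)^{N}}{N!}\left\{\prod(w_i+A_i-a_i)\right\}dw_1\cdots dw_n \\ &\quad + e^{-2\lambda r}\left\{\prod A_i^{2} - \prod\left(A_i^{2}-[A_i-a_i]^{2}\right)\right\} - R^{2}e^{-2\lambda r}. \end{aligned} \tag{23} \]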


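Integrating term by term and simplifying the resulting expression, we obtain
finally

\[ \sigma_X^{2} = 2^{n}\,r\,e^{-2\lambda r}\sum_{N=1}^{\infty}\frac{(\lambda r)^{N}}{N!\,\{(N+1)(N+2)\}^{n}}\prod\left\{(N+2)A_i - a_i + [a_i-A_i]\left(1-\frac{A_i}{a_i}\right)^{N+1}\right\}. \tag{24} \]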

4. Circles in the plane. Let the random set X be defined as follows. Let
$A_1$, $A_2$, a, and $\delta$ be fixed positive numbers such that $2a < \min(A_1, A_2, 2\delta)$.
Let R denote the rectangle consisting of all points $(x_1, x_2)$ such that $0 < x_1 < A_1$,
$0 < x_2 < A_2$, and let R' denote the larger rectangle for which $-\delta < x_1 < A_1 + \delta$,
$-\delta < x_2 < A_2 + \delta$. Let a fixed number N of circles with radii a and areas
$b = \pi a^{2}$ be chosen independently, with the probability density function for
the center of each circle constant and equal to 1/R' in R'. The set X is the
intersection of the set-theoretical sum of the N circles with R. The set Y consists
of those points of R that do not belong to X. Equation (1) holds as before.
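The analogue of (4) is

\[ E(Y) = \int_{0}^{A_2}\int_{0}^{A_1}\left(1-\frac{b}{R'}\right)^{N}dx_1\,dx_2 = R\left(1-\frac{b}{R'}\right)^{N}, \tag{25} \]

while (8) becomes

\[ E(Y^{2}) = 4\int_{0}^{A_2}\int_{0}^{A_1}\int_{0}^{y_2}\int_{0}^{y_1} p(x_1,x_2,y_1,y_2)\,dx_1\,dx_2\,dy_1\,dy_2, \tag{26} \]

where

\[ p(x_1,x_2,y_1,y_2) = \Pr((x_1,x_2) \in Y \text{ and } (y_1,y_2) \in Y). \tag{27} \]

Introducing the new variables (9) we obtain the analogue of (12),

\[ E(Y^{2}) = 4\int_{0}^{A_2}\int_{0}^{A_1} f(v_1,v_2)\,(A_1-v_1)(A_2-v_2)\,dv_1\,dv_2, \tag{28} \]

where, setting $r = (v_1^{2} + v_2^{2})^{1/2}$,

\[ f(v_1,v_2) = \begin{cases} \left(1-\dfrac{2b}{R'}\right)^{N} & \text{if } r \ge 2a, \\[2ex] \left(1-\dfrac{2b - 2a^{2}\arccos\left(\dfrac{r}{2a}\right) + \dfrac{r}{2}\sqrt{4a^{2}-r^{2}}}{R'}\right)^{N} & \text{if } r < 2a. \end{cases} \tag{29} \]

Introducing polar coordinates r, $\theta$ in the $v_1, v_2$-plane and carrying out the obvious
integrations, we obtain

\[ \begin{aligned} E(Y^{2}) &= \left(1-\frac{2b}{R'}\right)^{N}\left\{R^{2} + \frac{32}{3}a^{3}(A_1+A_2) - 8a^{4} - 4bR\right\} \\ &\quad + 8a^{2}\int_{0}^{1}\left(\pi R t + 4a^{2}t^{3} - 4a(A_1+A_2)t^{2}\right)\left(1-\frac{2b - 2a^{2}\arccos t + 2a^{2}t\sqrt{1-t^{2}}}{R'}\right)^{N}dt. \end{aligned} \tag{30} \]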
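The remaining integral in (30) has no elementary closed form for general N, but it is a smooth one-dimensional integral and can be approximated directly. The sketch below uses a midpoint rule; the function name, the step count and the sample values are assumptions for illustration. A useful sanity check is N = 0, where no circles are present, Y = R with certainty, and (30) must reduce to $R^{2}$.

```python
import math


def second_moment_circles(A1, A2, a, delta, N, steps=2000):
    """Midpoint-rule evaluation of E(Y^2) from (30) for a fixed number N of circles."""
    R = A1 * A2
    R_prime = (A1 + 2 * delta) * (A2 + 2 * delta)
    b = math.pi * a * a
    head = (1 - 2 * b / R_prime) ** N * (
        R * R + 32.0 / 3.0 * a ** 3 * (A1 + A2) - 8 * a ** 4 - 4 * b * R)
    h = 1.0 / steps
    integral = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        weight = math.pi * R * t + 4 * a * a * t ** 3 - 4 * a * (A1 + A2) * t * t
        # area of the union of two circles whose centers are 2*a*t apart
        union = 2 * b - 2 * a * a * math.acos(t) + 2 * a * a * t * math.sqrt(1 - t * t)
        integral += weight * (1 - union / R_prime) ** N
    return head + 8 * a * a * integral * h


if __name__ == "__main__":
    A1, A2, a, delta = 2.0, 3.0, 0.5, 0.5
    print(second_moment_circles(A1, A2, a, delta, 0))   # close to R^2 = 36
    ey = A1 * A2 * (1 - math.pi * a * a / ((A1 + 2 * delta) * (A2 + 2 * delta)))
    print(second_moment_circles(A1, A2, a, delta, 1) - ey ** 2)  # variance for N = 1
```

Subtracting the square of (25) from the returned value gives the variance of Y, and hence of X, for fixed N.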
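If now N is a random variable with generating function (17), then (25) becomes

\[ E(Y) = R\,\varphi\!\left(1-\frac{b}{R'}\right), \tag{31} \]

and hence

\[ E(X) = R\left\{1-\varphi\!\left(1-\frac{b}{R'}\right)\right\}, \tag{32} \]

while

\[ \begin{aligned} \sigma_X^{2} &= E(X^{2}) - E^{2}(X) = E(Y^{2}) - E^{2}(Y) \\ &= \varphi\!\left(1-\frac{2b}{R'}\right)\left\{R^{2} + \frac{32}{3}a^{3}(A_1+A_2) - 8a^{4} - 4bR\right\} - R^{2}\varphi^{2}\!\left(1-\frac{b}{R'}\right) \\ &\quad + 8a^{2}\int_{0}^{1}\left(\pi R t + 4a^{2}t^{3} - 4a(A_1+A_2)t^{2}\right)\varphi\!\left(1-\frac{2b - 2a^{2}\arccos t + 2a^{2}t\sqrt{1-t^{2}}}{R'}\right)dt. \end{aligned} \tag{33} \]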










SAMPLING FROM A CHANGING POPULATION¹ ²


By REINHOLD BAER 


University of Illinois 


1. Introduction. If, in sampling a certain population, it is impossible to take
more than one sample at any given time, and if the population changes between
any two samples, then we are confronted with the following mathematical situation.
For every³ t, $0 \le t \le 1$, there is given a distribution⁴ (= population)
D(t). Let furthermore $t_j$ be, for $0 < j \le n$, a number between (j - 1)/n and
j/n; and assume that $x_j$ is a sample taken from the population $D(t_j)$. We denote
by $T_n$ the set of the numbers $t_1, \cdots, t_n$ and by $O(T_n)$ the sample consisting of
the $x_j$; and we assume that $O(T_n)$ is a random sample, i.e. that $x_1, \cdots, x_n$ are
independent variables. The question arises how to get information concerning the
family D(t) from the sample $O(T_n)$. It is clearly hopeless to try for information
concerning an individual D(t) or even some $D(t_j)$ or the statistics that may be
derived from them. But we may hope for information in the mean, if we assume
that the family D(t) is in some sense continuous in t. To make this statement
more precise we denote by a(t) the average and by $M_i(t)$ the i-th moment of
D(t) around its average. We assume then that a(t) and $M_i(t)$, for $i \le 8$, exist
and are continuous functions of t, and in section 7 we shall have to assume
furthermore that a(t) and $M_2(t)$ are functions of bounded variation. These
hypotheses assure the existence of

the mean average $a = \int_0^1 a(t)\,dt$

and the mean i-th moment $M_i = \int_0^1 M_i(t)\,dt$

for $i \le 8$. Clearly we may hope for information concerning a and $M_i$ from the
random sample $O(T_n)$. It is our object to discuss certain more or less well
known statistics of the sample $O(T_n)$, and to determine their stochastic limits⁵.


1 Presented to the American Mathematical Society. September 15, 1945. 

2 The author is indebted to Dr. E. L. Welker for checking the results, in particular those 
rather obnoxious computations needed in sections 6 and 7 which the author did not incor- 
porate into this paper. 

3 It constitutes a restriction of generality that we consider finite closed intervals only. 
But it is no further loss in generality to use the interval from 0 to 1, and this choice certainly 
simplifies notations. 

4 Comparatively little will be assumed of these distributions. These properties will 
be enumerated in Section 2. : 

5 See [2] p. 81 and the criterion 2.d. of section 2. 
As an illustration we mention the following results which will be obtained in the
course of this investigation (among others):⁶

$\bar{x} = n^{-1}\sum_{j=1}^{n} x_j$ converges stochastically to the mean average a;

$s^{2} = n^{-1}\sum_{j=1}^{n}(x_j - \bar{x})^{2}$ converges stochastically to $M_2 + \int_0^1 (a(t) - a)^{2}\,dt$;

$d^{2} = (2n)^{-1}\sum_{j=1}^{n-1}(x_j - x_{j+1})^{2}$ converges stochastically to the mean variance $M_2$.

It is clear that $M_2$ is the stochastic limit of $s^{2}$ if, and only if, a(t) is constant.
If a(t) is not constant, then $s^{2}$ is not a consistent estimate⁷ of $M_2$, and will have
to be rejected, at least for large n, in favor of $d^{2}$ which is always a consistent
estimate of $M_2$.
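The contrast between $s^{2}$ and $d^{2}$ is easy to see in a small simulation. The sketch below assumes, purely for illustration, that each D(t) is normal with mean a(t) = 2t and unit variance, and takes $t_j$ as the midpoint of its interval; none of these choices comes from the paper, whose results do not require normality. Here $M_2 = 1$ and the variance of a(t) is 1/3, so $s^{2}$ should settle near 4/3 while $d^{2}$ settles near 1.

```python
import random


def sample_changing_population(n, mean_fn, sd_fn, seed=0):
    """Draw x_j from a normal D(t_j), with t_j the midpoint of ((j-1)/n, j/n)."""
    rng = random.Random(seed)
    return [rng.gauss(mean_fn((j - 0.5) / n), sd_fn((j - 0.5) / n))
            for j in range(1, n + 1)]


def s_squared(xs):
    """s^2 = n^-1 * sum (x_j - xbar)^2."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / n


def d_squared(xs):
    """d^2 = (2n)^-1 * sum_{j=1}^{n-1} (x_j - x_{j+1})^2."""
    n = len(xs)
    return sum((xs[j] - xs[j + 1]) ** 2 for j in range(n - 1)) / (2 * n)


if __name__ == "__main__":
    xs = sample_changing_population(20000, lambda t: 2 * t, lambda t: 1.0)
    print("s^2 =", s_squared(xs))  # near M_2 + variance of a(t) = 4/3
    print("d^2 =", d_squared(xs))  # near M_2 = 1
```

The difference of the two printed values estimates the variance of a(t).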

It was this last point that led us into this investigation. Recently the statistic
$d^{2}$ has found much attention; and the question arose as to why the statistic
$s^{2}$ should be rejected in favor of $d^{2}$. Reading the illuminating introduction of the
fundamental paper [1], one sees that just such a situation as we have attempted
to describe here in somewhat abstract terms has necessitated the use of $d^{2}$.
Consequently our result may be considered a theoretical justification for this
procedure.

Our other results will be discussed in their interrelation as they are obtained.
It should be noted that all our results concern themselves with stochastic convergence,
and thus they justify the use of a sample function as an estimate of
some statistical number only for sufficiently large size n of the sample. Thus
it is quite possible that for small n other functions provide better estimates.
The practical applicability of our results depends, therefore, on a criterion for n
to be sufficiently large, and unfortunately such a criterion is not yet available.


2. Notations and fundamental properties. We have not stated in the Intro- 
duction the hypotheses to which we subject the distributions under considera- 
tion. For our investigation we shall need only very few properties of distribu- 
tions. Thus we are going to enumerate now some properties of distributions 
which we are going to use, and we shall assume throughout that these properties 
are satisfied. As will be seen these hypotheses are rather weak and are satisfied 
by a large class of distributions. 

If x is any stochastic variable, then we denote by E(x) its mathematical ex- 
pectation, and the only properties of stochastic variables that concern us are 
properties of their expectations. E(x) is a linear operation satisfying E(1) = 1. 





6 It should be noted that the stochastic limit of the following statistics would not be 
changed, if we substituted for the denominator n of s? the denominator n — 1 which is often 
used, and if we allowed the summation in the expression for d? to range from 1 to n, defining 
En41 BS XM. 


7 Wilks [2], p. 133. 
If furthermore $x_1, \cdots, x_n$ are independent variables, and if the function f
depends on some of these variables whereas g depends only on the others, then
E(fg) = E(f)E(g), and this property may serve as a definition of independence.

As stated in the Introduction we are going to study a family D(t) of distributions,
for $0 \le t \le 1$. If x is the stochastic variable of the distribution D(t)
for some fixed t, then we let

\[ a(t) = E(x) \quad\text{and}\quad M_i(t) = E((x - a(t))^{i}). \]

We shall assume throughout that the average a(t) and the variance $M_2(t)$ exist
for every t, and that a(t) and $M_2(t)$ are continuous functions of t. Moreover,
when discussing $M_i(t)$, $1 < i \le 4$, we shall assume that every $M_j(t)$ with $j \le 2i$
is a continuous function of t. Thus we are sure that the mean average a
and the mean variance $M_2$, as defined in the Introduction, always exist, and
the mean i-th moment $M_i$ exists, whenever $M_i(t)$ is a continuous function of t.

Remark: If the mean i-th moment $M_i$ exists for every i, then one may be
tempted to consider as the mean of the family D(t) a distribution D with average
a and i-th moment $M_i$, provided such a distribution exists. But this has to be
done with some caution. For suppose that every D(t) is normal. Then $M_i(t) = 0$
for every odd i, implying $M_i = 0$ for odd i so that D would be symmetric.
But $M_{2i}(t) = 1\cdot 3\cdots(2i-1)\,M_2(t)^{i}$ and hence
$M_{2i} = 1\cdot 3\cdots(2i-1)\int_0^1 M_2(t)^{i}\,dt$, and the integral will be the i-th power of $M_2$ only if $M_2(t)$ is constant.
Thus the mean distribution D of a continuous family of normal distributions
need not be normal.
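A short numerical instance may make the remark concrete. Assume, only as an example, a normal family with variance $M_2(t) = 1 + t$. Then $M_2 = 3/2$ and $M_4 = 3\int_0^1 (1+t)^{2}\,dt = 7$, whereas a normal distribution with variance 3/2 would have fourth moment $3\,(3/2)^{2} = 6.75$. The Python sketch below reproduces these numbers with a midpoint rule; the helper name and the choice of $M_2(t)$ are assumptions.

```python
def integrate_unit_interval(f, steps=10000):
    """Midpoint-rule integral of f over [0, 1] (hypothetical helper)."""
    h = 1.0 / steps
    return sum(f((k + 0.5) * h) for k in range(steps)) * h


def variance_at(t):
    """Illustrative choice M_2(t) = 1 + t for a normal family D(t)."""
    return 1.0 + t


mean_variance = integrate_unit_interval(variance_at)                       # M_2
mean_fourth = 3.0 * integrate_unit_interval(lambda t: variance_at(t) ** 2)  # M_4
normal_fourth = 3.0 * mean_variance ** 2   # fourth moment if D were normal

if __name__ == "__main__":
    print(mean_variance, mean_fourth, normal_fourth)
```

Since the mean fourth moment exceeds the value a normal law with the same variance would give, the mean distribution in this example is not normal.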

As in the Introduction we now let $t_j$ be some number between (j - 1)/n and
j/n, and denote by $x_j$ a sample taken from the distribution $D(t_j)$. We denote
by $T_n$ the set of the n numbers $t_j$ and by $O(T_n)$ the sample consisting of the $x_j$.
It will be assumed throughout that $O(T_n)$ is a random sample, i.e. we shall
assume that $x_1, \cdots, x_n$ are independent variables.

We are not going to make any use of the customary definition of stochastic
convergence⁸ (and we shall therefore not restate it). Instead we are going to
apply throughout the following criterion⁹ ¹⁰:

2.d. The function $f(O(T_n))$ of the sample $O(T_n)$ converges stochastically to the
number r, if

\[ \lim_{n\to\infty} E(f(O(T_n))) = r \quad\text{and}\quad \lim_{n\to\infty} E\left((f(O(T_n)) - E(f(O(T_n))))^{2}\right) = 0. \]

All the sample functions considered will be polynomials of the variables
$x_1, \cdots, x_n$.


8 Wilks [2], p. 81. 

9 Wilks [2], Theorem (A), p. 134.

10 The validity of criterion 2.d. implies stochastic convergence in the customary sense. 
Thus, all results obtained in the present paper remain valid also when the customary defini- 
tion of stochastic convergence is adopted. 
3. The mean average. Though the discussion of this section is rather obvious,
we give the details, since they may serve as a convenient introduction to the type
of argument we have to use throughout.

THEOREM. $\bar{x}$ converges stochastically to a.

PROOF: We note first that $E(\bar{x}) = n^{-1}\sum_{j=1}^{n} E(x_j) = n^{-1}\sum_{j=1}^{n} a(t_j)$. Since $t_j$ is
between (j - 1)/n and j/n, and since $n^{-1}$ is the length of this interval, it follows
from the continuity of a(t) that

\[ \int_0^1 a(t)\,dt = \lim_{n\to\infty} n^{-1}\sum_{j=1}^{n} a(t_j); \]

and thus we have shown that $E(\bar{x})$ tends to a as n tends to infinity.
Next we find that

\[ E((\bar{x} - E(\bar{x}))^{2}) = n^{-2}E\left(\Bigl[\sum_{j=1}^{n}(x_j - a(t_j))\Bigr]^{2}\right) = n^{-2}\sum_{j=1}^{n}E((x_j - a(t_j))^{2}) = n^{-2}\sum_{j=1}^{n} M_2(t_j), \]

since $E((x_j - a(t_j))(x_h - a(t_h))) = E(x_j - a(t_j))E(x_h - a(t_h)) = 0$ for $j \ne h$.
But $M_2(t)$ is, for $0 \le t \le 1$, a bounded non-negative function, showing that
$E((\bar{x} - E(\bar{x}))^{2})$ tends to 0 as n tends to infinity. Applying 2.d. we find that
$\bar{x}$ converges stochastically to a, as we intended to show.

Remark: It is clear that the speed of the stochastic convergence of $\bar{x}$ to a depends
on two factors:
(i) the goodness of $\bar{x}$ as an estimate of $E(\bar{x})$;
(ii) the speed of convergence of the sums $n^{-1}\sum a(t_j)$ to the integral $a = \int_0^1 a(t)\,dt$.

It is this difficulty which expresses itself in (ii) and which makes the present
type of statistical estimation less effective than the one concerned with sampling
from one distribution only. As to (i), it is again, as may be seen from the proof,
of the order of magnitude $(M_2/n)^{1/2}$, (see Theorem 1, section 4).

It is probable that $\bar{x}$ is a better estimate of $E(\bar{x})$ than of a. But this does not
help, since the former depends on the particular choice of $T_n$.

4. The variance. THEOREM 1. $d^{2}$ converges stochastically to $M_2$.

PROOF: We note first that

\[ \begin{aligned} E((x_j - x_{j+1})^{2}) &= E([(x_j - a(t_j)) + (a(t_j) - a(t_{j+1})) + (a(t_{j+1}) - x_{j+1})]^{2}) \\ &= M_2(t_j) + (a(t_j) - a(t_{j+1}))^{2} + M_2(t_{j+1}), \end{aligned} \]

since $E((x_j - a(t_j))(x_{j+1} - a(t_{j+1}))) = E(x_j - a(t_j))E(x_{j+1} - a(t_{j+1})) = 0$,
E(const) = const and $E((x_j - a(t_j))^{2}) = M_2(t_j)$. Hence

\[ E(d^{2}) = (2n)^{-1}(A + B - C), \]

where $A = 2\sum_{j=1}^{n} M_2(t_j)$, $B = \sum_{j=1}^{n-1}(a(t_j) - a(t_{j+1}))^{2}$, $C = M_2(t_1) + M_2(t_n)$. Since
$t_j$ is a value between (j - 1)/n and j/n, and since $n^{-1}$ is the length of this interval,
it follows from the continuity of the function $M_2(t)$ that $M_2 = \int_0^1 M_2(t)\,dt = \lim_{n\to\infty}(2n)^{-1}A$.
Since $M_2(t)$ is bounded as a continuous function, it follows that
$(2n)^{-1}C$ tends to 0 as n tends to infinity. Finally we infer from the continuity
of a(t), which is used here for the first time to its full extent, that there exists
to every given positive $\epsilon$ an integer $N = N(\epsilon)$ such that $(a(t') - a(t''))^{2} < \epsilon$
for $|t' - t''| < 2N^{-1}$. Thus for $N(\epsilon) < n$ we have $(a(t_j) - a(t_{j+1}))^{2} < \epsilon$ and
$(2n)^{-1}B < \epsilon$. Hence $(2n)^{-1}B$ tends to 0 as n tends to infinity, and we have
shown that

$E(d^{2})$ tends to $M_2$ as n tends to infinity.

Next we note that

\[ \begin{aligned} E((d^{2} - E(d^{2}))^{2}) &= E(d^{4}) - E(d^{2})^{2} \\ &= (2n)^{-2}\sum_{i,j}\left[E((x_i - x_{i+1})^{2}(x_j - x_{j+1})^{2}) - E((x_i - x_{i+1})^{2})E((x_j - x_{j+1})^{2})\right]. \end{aligned} \]

But if both i and i + 1 are different from j and j + 1, then
$E((x_i - x_{i+1})^{2}(x_j - x_{j+1})^{2}) = E((x_i - x_{i+1})^{2})E((x_j - x_{j+1})^{2})$, and thus there are not more than 3n
summands in the above summation that are not identically 0. These summands,
however, depend only on $a(t_h)$, $M_2(t_h)$, $M_3(t_h)$ and $M_4(t_h)$, and they are
therefore bounded. Thus $E((d^{2} - E(d^{2}))^{2})$ is equal to $(2n)^{-2}$ times a sum of
not more than 3n summands which are bounded. Hence $E((d^{2} - E(d^{2}))^{2})$ tends
to 0, as n tends to infinity. Now our theorem is an immediate consequence of
the criterion 2.d.


THEOREM 2. $s^{2}$ converges stochastically to $M_2 + \int_0^1 (a(t) - a)^{2}\,dt$.

PROOF: We note first that $n(x_j - \bar{x}) = \sum_{h=1}^{n}(x_j - x_h)$ and that therefore
$s^{2} = n^{-3}\sum_{j,h,k}(x_j - x_h)(x_j - x_k)$. Since $x_j - x_h = x_j - a(t_j) + a(t_j) - a(t_h) - (x_h - a(t_h))$
we find as usual that

\[ E((x_j - x_h)^{2}) = M_2(t_j) + (a(t_j) - a(t_h))^{2} + M_2(t_h), \]

and if $h \ne k$ we find that

\[ E((x_j - x_h)(x_j - x_k)) = M_2(t_j) + (a(t_j) - a(t_h))(a(t_j) - a(t_k)). \]

Consequently

\[ \begin{aligned} \sum_{h,k} E((x_j - x_h)(x_j - x_k)) &= n^{2}M_2(t_j) + \sum_{h=1}^{n} M_2(t_h) + \sum_{h,k}(a(t_j) - a(t_h))(a(t_j) - a(t_k)) \\ &= n^{2}M_2(t_j) + \sum_{h=1}^{n} M_2(t_h) + \Bigl[\sum_{h=1}^{n}(a(t_j) - a(t_h))\Bigr]^{2}. \end{aligned} \]
Consequently

\[ E(s^{2}) = n^{-1}\sum_{j} M_2(t_j) + n^{-2}\sum_{h} M_2(t_h) + n^{-3}\sum_{j}\Bigl[\sum_{h}(a(t_j) - a(t_h))\Bigr]^{2}. \]

As in the proof of Theorem 1 we see that the first of these sums tends to $M_2$ as
n tends to infinity, and the second of these sums therefore tends to 0 as n tends
to infinity. The last sum equals

\[ \begin{aligned} & n^{-3}\sum_{j,h,k}\left[a(t_j)^{2} - a(t_j)(a(t_h) + a(t_k)) + a(t_h)a(t_k)\right] \\ &\quad = n^{-1}\sum_{j} a(t_j)^{2} - 2n^{-2}\sum_{j,h} a(t_j)a(t_h) + n^{-2}\sum_{h,k} a(t_h)a(t_k) \\ &\quad = n^{-1}\sum_{j} a(t_j)^{2} - \Bigl[n^{-1}\sum_{j} a(t_j)\Bigr]^{2}, \end{aligned} \]

and this expression tends to $\int_0^1 a(t)^{2}\,dt - \bigl[\int_0^1 a(t)\,dt\bigr]^{2}$ as n tends to infinity.
But

\[ \int_0^1 a(t)^{2}\,dt - \Bigl[\int_0^1 a(t)\,dt\Bigr]^{2} = \int_0^1 (a(t) - a)^{2}\,dt, \]

since $a = \int_0^1 a(t)\,dt$, and thus we have shown that $E(s^{2})$ tends to
$M_2 + \int_0^1 (a(t) - a)^{2}\,dt$ as n tends to infinity.

If j, h, k, p, q, r are integers between 1 and n, we put

\[ \begin{aligned} (j,h,k;\,p,q,r) &= E((x_j - x_h)(x_j - x_k)(x_p - x_q)(x_p - x_r)) \\ &\quad - E((x_j - x_h)(x_j - x_k))\,E((x_p - x_q)(x_p - x_r)). \end{aligned} \]

If neither j, h nor k is equal to any of the three integers p, q, r, it follows from the
independence of the variables $x_i$ that (j, h, k; p, q, r) = 0. Thus

\[ E((s^{2} - E(s^{2}))^{2}) = E(s^{4}) - E(s^{2})^{2} = n^{-6}\sum{}' (j,h,k;\,p,q,r), \]

where the summation is taken over all the values of j, h, k, p, q, r between 1 and
n with the restriction that at least one of the three numbers j, h, k is equal to at
least one of the three numbers p, q, r. This sum contains therefore not more than
$3^{2}n^{5}$ summands, and each of the summands is bounded, since they depend only on
$a(t_i)$, $M_2(t_i)$, $M_3(t_i)$ and $M_4(t_i)$. Thus $E((s^{2} - E(s^{2}))^{2})$ is equal to $n^{-6}$ times a
sum of not more than $3^{2}n^{5}$ summands which are bounded. Hence
$E((s^{2} - E(s^{2}))^{2})$ tends to 0 as n tends to infinity. Now our theorem is an immediate
consequence of the criterion 2.d.

Noting that $\int_0^1 (a(t) - a)^{2}\,dt$ is nothing but the variance of the function
a(t) (around its mean a), we obtain the following obvious consequence of Theorems
1 and 2.
COROLLARY: $s^{2} - d^{2}$ converges stochastically to the variance of a(t).

Remarks similar to those made in connection with the proof of the theorem of
section 3 may be made now in regard to the theorems of this section.

By similar arguments it is possible to prove that the statistic $n^{-1}\sum_{i=1}^{n-1} x_i x_{i+1}$
converges stochastically to $\int_0^1 a(t)^{2}\,dt$.
0 


5. The third moment. Put $d(3) = n^{-1}\sum_{j=1}^{n-2}(x_j - x_{j+1})^{2}(x_{j+1} - x_{j+2})$. Then
d(3) is a function of the random sample $O(T_n)$.

THEOREM 1: d(3) converges stochastically to $M_3$.

PROOF: It is readily seen that

\[ \begin{aligned} E((x_j - x_{j+1})^{2}(x_{j+1} - x_{j+2})) &= M_3(t_{j+1}) + (a(t_{j+1}) - a(t_{j+2}))\bigl(M_2(t_j) \\ &\quad + (a(t_j) - a(t_{j+1}))^{2} + M_2(t_{j+1})\bigr), \end{aligned} \]

and in practically the same fashion as in the proof of Theorem 1 of section 4 one
shows now that E(d(3)) tends to $M_3$ as n tends to infinity.
Furthermore we have

\[ E((d(3) - E(d(3)))^{2}) = E(d(3)^{2}) - E(d(3))^{2} = n^{-2}\sum_{j,h}(j,h), \]

where

\[ \begin{aligned} (j,h) &= E((x_j - x_{j+1})^{2}(x_{j+1} - x_{j+2})(x_h - x_{h+1})^{2}(x_{h+1} - x_{h+2})) \\ &\quad - E((x_j - x_{j+1})^{2}(x_{j+1} - x_{j+2}))\,E((x_h - x_{h+1})^{2}(x_{h+1} - x_{h+2})). \end{aligned} \]

Clearly (j, h) = 0 whenever j + 2 < h or h + 2 < j. Consequently there
appear actually in the sum of all the (j, h) not more than 5n terms each of which
is bounded by an absolute constant, since they depend only on $a(t_i)$, $M_2(t_i)$,
$M_3(t_i)$, $M_4(t_i)$, $M_5(t_i)$ and $M_6(t_i)$. From this fact we infer as before that
$E((d(3) - E(d(3)))^{2})$ tends to 0, as n tends to infinity, and our theorem is an
immediate consequence of the criterion 2.d.
Remark 1. If $M_3(t)$, $M_2(t)$ and a(t) are constant, it follows from the proof that

\[ E(d(3)) = \frac{n-2}{n}\,M_3; \]

and thus $(n-2)^{-1}\sum_{j=1}^{n-2}(x_j - x_{j+1})^{2}(x_{j+1} - x_{j+2})$ is an unbiased estimate of $M_3$.

Remark 2. One might be tempted to use instead of d(3) the following function:

\[ n^{-1}\sum_{j=1}^{n-1}(x_j - x_{j+1})^{3}. \]

By an argument of a nature rather similar to the one used in the preceding proof
one may show, however, that this statistic converges stochastically to 0.
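The statistics of this section translate directly into code. The sketch below implements d(3), the rescaled version of Remark 1, and the cubed-difference statistic of Remark 2, reading the definitions as sums of $(x_j - x_{j+1})^{2}(x_{j+1} - x_{j+2})$ and of $(x_j - x_{j+1})^{3}$ respectively. The function names are illustrative and not taken from the paper.

```python
def d3(xs):
    """d(3) = n^-1 * sum_{j=1}^{n-2} (x_j - x_{j+1})^2 (x_{j+1} - x_{j+2})."""
    n = len(xs)
    return sum((xs[j] - xs[j + 1]) ** 2 * (xs[j + 1] - xs[j + 2])
               for j in range(n - 2)) / n


def d3_rescaled(xs):
    """The same sum divided by n - 2 instead of n (Remark 1)."""
    n = len(xs)
    return d3(xs) * n / (n - 2)


def cubed_differences(xs):
    """n^-1 * sum_{j=1}^{n-1} (x_j - x_{j+1})^3, the statistic of Remark 2."""
    n = len(xs)
    return sum((xs[j] - xs[j + 1]) ** 3 for j in range(n - 1)) / n


if __name__ == "__main__":
    data = [1.0, 3.0, 2.0, 5.0]
    print(d3(data), d3_rescaled(data), cubed_differences(data))
```

For the four sample values above the products are 4 and -3, so d(3) = 1/4 and the rescaled value is 1/2.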
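Put $s(3) = n^{-1}\sum_{j=1}^{n}(x_j - \bar{x})^{3}$. Then s(3) is a function of the random sample
$O(T_n)$. Furthermore let

\[ F_3 = 3\left(\int_0^1 a(t)M_2(t)\,dt - aM_2 - a\int_0^1 a(t)^{2}\,dt\right) + 2a^{3} + \int_0^1 a(t)^{3}\,dt. \]

THEOREM 2. s(3) converges stochastically to $M_3 + F_3$.

PROOF: For fixed j, let $X(j) = \sum_{h=1}^{n}(x_j - a(t_j) + a(t_h) - x_h)$ and
$A(j) = \sum_{h=1}^{n}(a(t_j) - a(t_h))$. Then

\[ \begin{aligned} E(s(3)) &= n^{-4}\sum_{j} E((X(j) + A(j))^{3}) \\ &= n^{-4}\sum_{j}\left[E(X(j)^{3}) + 3A(j)E(X(j)^{2}) + A(j)^{3}\right], \end{aligned} \]

since E(X(j)) is easily seen to be 0. We find furthermore that

\[ \begin{aligned} E(X(j)^{3}) &= (n-1)^{3}M_3(t_j) + E\Bigl(\Bigl[\sum_{h \ne j}(a(t_h) - x_h)\Bigr]^{3}\Bigr) \\ &= ((n-1)^{3} + 1)M_3(t_j) - \sum_{h=1}^{n} M_3(t_h); \\ E(X(j)^{2}) &= (n-1)^{2}M_2(t_j) + E\Bigl(\Bigl[\sum_{h \ne j}(a(t_h) - x_h)\Bigr]^{2}\Bigr) \\ &= ((n-1)^{2} - 1)M_2(t_j) + \sum_{h=1}^{n} M_2(t_h). \end{aligned} \]

Consequently

\[ \begin{aligned} E(s(3)) &= n^{-4}\Bigl\{((n-1)^{3} - n + 1)\sum_{j} M_3(t_j) + 3((n-1)^{2} - 1)\sum_{j} A(j)M_2(t_j) \\ &\quad + 3\sum_{j} A(j)\sum_{h} M_2(t_h) + \sum_{j} A(j)^{3}\Bigr\}. \end{aligned} \]

Since furthermore $\sum_{j} A(j) = \sum_{j,h}(a(t_j) - a(t_h)) = 0$,

\[ \sum_{j} A(j)M_2(t_j) = n\sum_{j} a(t_j)M_2(t_j) - \sum_{h} a(t_h)\sum_{j} M_2(t_j) \]

and

\[ \begin{aligned} \sum_{j} A(j)^{3} &= \sum_{j}\Bigl[n\,a(t_j) - \sum_{h} a(t_h)\Bigr]^{3} \\ &= n^{3}\sum_{j} a(t_j)^{3} - 3n^{2}\sum_{j} a(t_j)^{2}\sum_{h} a(t_h) + 3n\sum_{j} a(t_j)\Bigl[\sum_{h} a(t_h)\Bigr]^{2} - n\Bigl[\sum_{h} a(t_h)\Bigr]^{3}, \end{aligned} \]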
it is easily verified that E(s(3)) tends to $M_3 + F_3$, as n tends to infinity.

To prove that $E((s(3) - E(s(3)))^{2})$ tends to 0 as n tends to infinity, one
proceeds as in the proofs of the preceding theorems, namely by verifying that
this expectation is $n^{-8}$ times a sum of not more than $4^{2}n^{7}$ summands which are
bounded, since they depend only on $a(t_i)$ and on the $M_m(t_i)$ for 1 < m < 7.
The proof of the theorem may then be completed by applying the criterion 2.d.

It is readily seen that $F_3$ vanishes whenever a(t) is constant. But from

\[ F_3 = 3\left[\int_0^1 a(t)M_2(t)\,dt - aM_2\right] + \int_0^1 (a(t) - a)^{3}\,dt \]

we infer that $F_3$ vanishes too whenever $M_2(t)$ is constant and a(t) is at the same
time symmetric with regard to a, and more precisely: if $M_2(t)$ is constant, a
necessary and sufficient condition for the vanishing of $F_3$ is the vanishing of the
third moment of the function a(t) around its mean. Thus we see that d(3)
is always a consistent statistic for $M_3$, though s(3) is not.


6. The fourth moment. The results in this section will be stated without 
proof. Their proofs can be constructed on exactly the same lines as the proofs 
in sections 4 and 5. 


\[ (2n)^{-1}\sum_{j=1}^{n-1}(x_j - x_{j+1})^{4}, \qquad n^{-1}\sum_{j=2}^{n-1}(x_{j-1} - x_j)(x_{j+1} - x_j)^{3} \]

and

\[ n^{-1}\sum_{j=2}^{n-1}(x_{j-1} - x_j)^{2}(x_{j+1} - x_j)^{2} \]

converge stochastically to $M_4 + 3\int_0^1 M_2(t)^{2}\,dt$.


n—1 2 
(4n) |= (1; — rs | converges stochastically to M, + M3. 
j=l 


$(4n)^{-1}\sum_{j=2}^{n-2}(x_{j-1} - x_j)^{2}(x_{j+1} - x_{j+2})^{2}$ converges stochastically to $\int_0^1 M_2(t)^{2}\,dt$.


From these facts one easily deduces that M, is the stochastic limit of 


afi , 3S + 
n | 3 Dy (tj — tin) — |B DY (ia — 24)" — zs), 
a= 4m 


: ; 
and thai [ (M.(t) — M’)? dt is the stochastic limit of 
0 


n—1 n—1 z n—2 
(any © (x; — tj)* — _— w= rh _ » (xj-1 — 2;) (ti — as) | 


j=2 


7. Efficiency. If $f = f(O(T_n))$ is a function of the random sample $O(T_n)$,
and if f converges stochastically to a number r, then

\[ \lim_{n\to\infty} nE((f - r)^{2}) \]

may be considered as some sort of a measure for the efficiency¹¹ ¹² of the statistic
f as an estimate of r, provided, of course, the limit exists.
THEOREM 1. If the function $a(t)$ is of bounded variation, then

$$\lim_{n\to\infty} nE((\bar{x} - a)^2) = M_2.$$

PROOF: Clearly

$$nE((\bar{x} - a)^2) = n^{-1}E\left(\left[\sum_{j=1}^{n}(x_j - a)\right]^2\right) = n^{-1}\sum_{j=1}^{n} M_2(t_j) + n^{-1}\left[\sum_{j=1}^{n}(a(t_j) - a)\right]^2.$$

Now

$$\sum_{j=1}^{n}(a(t_j) - a) = \sum_{j=1}^{n} a(t_j) - na = \sum_{j=1}^{n}\left[a(t_j) - n\int_{(j-1)/n}^{j/n} a(t)\,dt\right].$$

Since $a(t)$ is a continuous function, there exists a number $u_j$ such that

$$(j-1)/n \le u_j \le j/n, \quad\text{and}\quad \int_{(j-1)/n}^{j/n} a(t)\,dt = n^{-1}a(u_j).$$

Thus

$$\sum_{j=1}^{n}(a(t_j) - a) = \sum_{j=1}^{n}(a(t_j) - a(u_j)).$$

But both $t_j$ and $u_j$ are between $(j-1)/n$ and $j/n$, and $a(t)$ is of bounded
variation. Hence there exists a constant $A$ which depends on $a(t)$ only and not on $n$
or $T_n$, such that

$$\left[\sum_{j=1}^{n}(a(t_j) - a)\right]^2 < A \quad\text{for every choice of } T_n.$$
The contention of our theorem is a fairly immediate consequence of these facts. 
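The decomposition used in the proof can be evaluated numerically. The following is a minimal sketch under stated assumptions: the sampling points are taken as $t_j = j/n$, and the mean function $a(t) = t^2$ and variance function $M_2(t) = 1 + t$ are hypothetical examples, so that $a = 1/3$ and $M_2 = 3/2$. The function name `scaled_mse` is likewise illustrative.

```python
def scaled_mse(n, a_func, m2_func, a_mean):
    """n * E((xbar - a)^2) computed from the decomposition in the proof:
    n^-1 * sum M_2(t_j) + n^-1 * (sum (a(t_j) - a))^2.
    Assumption: the sample points are t_j = j/n for j = 1..n.
    """
    ts = [j / n for j in range(1, n + 1)]
    variance_part = sum(m2_func(t) for t in ts) / n
    bias_part = sum(a_func(t) - a_mean for t in ts) ** 2 / n
    return variance_part + bias_part


def example(n):
    # Hypothetical choices: a(t) = t^2 (mean 1/3), M_2(t) = 1 + t (mean 3/2).
    return scaled_mse(n, lambda t: t * t, lambda t: 1.0 + t, 1.0 / 3.0)
```

For these choices the bias term stays bounded before division by $n$, so `example(n)` approaches $M_2 = 3/2$ as $n$ grows, as the theorem asserts.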

This theorem and its proof may serve as an additional substantiation of the 
remarks appended to section 3. 

Remark: If we had assumed only the continuity of $a(t)$ instead of its being
of bounded variation, we could have tried to argue as follows: Since $a(t)$ is
continuous, there exists to every positive number $\epsilon$ an integer $N(\epsilon)$ such that
$|a(t') - a(t'')| < \epsilon$ for $|t' - t''| < N(\epsilon)^{-1}$. Hence we would find that for
$N(\epsilon) < n$ we have

$$n^{-1}\left[\sum_{j=1}^{n}(a(t_j) - a)\right]^2 < n\epsilon^2;$$

and this inequality is certainly insufficient for proving that the left side of the
inequality tends to 0 as $n$ tends to infinity.

THEOREM 2: If the functions $a(t)$ and $M_2(t)$ are both of bounded variation, then

$$\lim_{n\to\infty} nE((d^2 - M_2)^2) = M_4.$$

11 Wilks [2], p. 134/135.
12 or a measure for the asymptotic variance of the function f.

PROOF: In the course of the proof of Theorem 1 of section 4 we have shown
that $E(d^2) = (2n)^{-1}(A + B - C)$, where

$$A = 2\sum_{j=1}^{n} M_2(t_j), \qquad B = \sum_{j=1}^{n-1}(a(t_j) - a(t_{j+1}))^2, \qquad C = M_2(t_1) + M_2(t_n).$$

Since $M_2(t)$ is bounded, it is clear that $n^{-1/2}C$ tends to 0 as $n$ tends to infinity.
Since $a(t)$ is of bounded variation, there exists a constant $B^*$ such that $B < B^*$
for every choice of $T_n$, and hence $n^{-1/2}B$ tends to 0 as $n$ tends to infinity.$^{13}$
Furthermore we have

$$\sum_{j=1}^{n} M_2(t_j) - nM_2 = \sum_{j=1}^{n}\left[M_2(t_j) - n\int_{(j-1)/n}^{j/n} M_2(t)\,dt\right].$$

Because of the continuity of $M_2(t)$ there exist numbers $v_j$ such that

$$(j-1)/n \le v_j \le j/n \quad\text{and}\quad M_2(v_j) = n\int_{(j-1)/n}^{j/n} M_2(t)\,dt.$$

Consequently

$$\sum_{j=1}^{n} M_2(t_j) - nM_2 = \sum_{j=1}^{n}[M_2(t_j) - M_2(v_j)].$$

But $M_2(t)$ is a function of bounded variation, and thus we may infer, as in the
proof of Theorem 1, that $n^{1/2}[(2n)^{-1}A - M_2]$ tends to 0 as $n$ tends to infinity.
Combining all the facts we see that $n^{1/2}[E(d^2) - M_2]$ tends to 0 as $n$ tends to
infinity, and hence we have shown that $n[E(d^2) - M_2]^2$ tends to 0, as $n$ tends to
infinity.

As in the proof of Theorem 1 of section 4 we note next that

$$E(d^4) - E(d^2)^2 = (2n)^{-2}\sum_{i,j}(i, j),$$

where

$$(i, j) = E((x_i - x_{i+1})^2(x_j - x_{j+1})^2) - E((x_i - x_{i+1})^2)E((x_j - x_{j+1})^2),$$

and that $(i, j) = 0$, if either $i + 1 < j$ or $j + 1 < i$. Next we observe that

$$\begin{aligned}
(i, j) ={}& E((x_i - a(t_i) + a(t_{i+1}) - x_{i+1})^2(x_j - a(t_j) + a(t_{j+1}) - x_{j+1})^2) \\
&- E((x_i - a(t_i) + a(t_{i+1}) - x_{i+1})^2)E((x_j - a(t_j) + a(t_{j+1}) - x_{j+1})^2) \\
&+ (a(t_i) - a(t_{i+1}))(i, j)' + (a(t_j) - a(t_{j+1}))(i, j)'',
\end{aligned}$$

where the expressions $(i, j)'$ and $(i, j)''$ are bounded (by a number independent
of $i$, $j$, $n$ or $T_n$).
Consequently we have

$$\begin{aligned}
(i, i) ={}& M_4(t_i) + 6M_2(t_i)M_2(t_{i+1}) + M_4(t_{i+1}) - (M_2(t_i) + M_2(t_{i+1}))^2 \\
&+ (a(t_i) - a(t_{i+1}))(i, i)^* \\
={}& M_4(t_i) + M_4(t_{i+1}) + M_2(t_i)^2 + M_2(t_{i+1})^2 \\
&- 2(M_2(t_i) - M_2(t_{i+1}))^2 + (a(t_i) - a(t_{i+1}))(i, i)^*,
\end{aligned}$$

where $(i, i)^* = (i, i)' + (i, i)''$ is bounded by a bound independent of $i$, $n$, $T_n$.





13 A remark similar to the one made just before stating Theorem 2 may be made here and
below about the indispensability of the hypothesis that a(t) and M2(t) be of bounded
variation.

Likewise we find that

$$\begin{aligned}
(i, i+1) ={}& M_2(t_i)M_2(t_{i+1}) + M_2(t_i)M_2(t_{i+2}) + M_4(t_{i+1}) + M_2(t_{i+1})M_2(t_{i+2}) \\
&- (M_2(t_i) + M_2(t_{i+1}))(M_2(t_{i+1}) + M_2(t_{i+2})) \\
&+ (a(t_i) - a(t_{i+1}))(i, i+1)' + (a(t_{i+1}) - a(t_{i+2}))(i, i+1)'' \\
={}& M_4(t_{i+1}) - M_2(t_{i+1})^2 \\
&+ (a(t_i) - a(t_{i+1}))(i, i+1)' + (a(t_{i+1}) - a(t_{i+2}))(i, i+1)''.
\end{aligned}$$

Hence

$$\begin{aligned}
(i, i) + 2(i, i+1) ={}& M_4(t_i) + 3M_4(t_{i+1}) + (M_2(t_i) - M_2(t_{i+1}))(3M_2(t_{i+1}) - M_2(t_i)) \\
&+ (a(t_i) - a(t_{i+1}))(i, i)^* + (a(t_{i+1}) - a(t_{i+2}))(i, i+1)'',
\end{aligned}$$

where $(i, i)^* = (i, i)' + (i, i)'' + (i, i+1)'$ is bounded by a bound independent
of $i$, $n$, $T_n$. Considering that

$$\sum_{i,j}(i, j) = \sum_{i=1}^{n-1}(i, i) + 2\sum_{i=1}^{n-2}(i, i+1),$$

it is now deduced from the continuity of the functions $a(t)$, $M_2(t)$ and $M_4(t)$ that
$n[E(d^4) - E(d^2)^2]$ tends to $M_4$, as $n$ tends to infinity. We note finally that
$E((d^2 - M_2)^2) = E((d^2 - E(d^2))^2) + (E(d^2) - M_2)^2$, and the theorem is an
immediate consequence of the facts we have deduced.
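The bookkeeping in this proof can be illustrated in the simplest setting, where $a(t)$, $M_2(t)$ and $M_4(t)$ are all constant. This is a sketch under that assumption only; the function name `scaled_variance_d2` is hypothetical. In this case $(i, i) = 2M_4 + 2M_2^2$ and $(i, i+1) = M_4 - M_2^2$, and the sum over all pairs can be written in closed form.

```python
def scaled_variance_d2(n, m2, m4):
    """n * [E(d^4) - E(d^2)^2] when a(t), M_2(t), M_4(t) are constant.

    Uses (i, i) = 2*M_4 + 2*M_2^2 and (i, i+1) = M_4 - M_2^2, with
    sum_{i,j} (i, j) = sum_{i=1}^{n-1} (i, i) + 2 * sum_{i=1}^{n-2} (i, i+1).
    """
    diagonal = 2 * m4 + 2 * m2**2
    neighbour = m4 - m2**2
    total = (n - 1) * diagonal + 2 * (n - 2) * neighbour
    return n * total / (2 * n) ** 2


# Standard normal errors: M_2 = 1, M_4 = 3, so the limit should be M_4 = 3.
value_for_normal = scaled_variance_d2(10**6, 1.0, 3.0)
```

The $M_2^2$ contributions of the diagonal and neighbouring terms cancel in the limit, which is why only $M_4$ survives.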

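THEOREM 3. If the functions $a(t)$ and $M_2(t)$ are both of bounded variation, then

$$\begin{aligned}
\lim_{n\to\infty}{}& nE\left(\left(s^2 - M_2 - \int_0^1 (a(t) - a)^2\,dt\right)^2\right) \\
&= M_4 - \int_0^1 M_2(t)^2\,dt + 4\int_0^1 (a(t)M_3(t) - aM_3)\,dt + 4\int_0^1 M_2(t)(a(t) - a)^2\,dt.
\end{aligned}$$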


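PROOF. Since $a(t)$ and $M_2(t)$ are of bounded variation, we show, as in the
proofs of the two preceding theorems, that

$$n^{1/2}\left(n^{-1}\sum_{j=1}^{n} a(t_j) - a\right), \qquad n^{1/2}\left(n^{-1}\sum_{j=1}^{n} a(t_j)^2 - \int_0^1 a(t)^2\,dt\right), \quad\text{and}\quad n^{1/2}\left(n^{-1}\sum_{j=1}^{n} M_2(t_j) - M_2\right)$$

all tend to 0, as $n$ tends to infinity. In the proof of Theorem 2 of section 4 we
computed $E(s^2)$. Using this result we obtain:

$$\begin{aligned}
n^{1/2}\left(E(s^2) - M_2 - \int_0^1 (a(t) - a)^2\,dt\right)
={}& n^{1/2}\left(n^{-1}\sum_{j=1}^{n} M_2(t_j) - M_2\right) - n^{-3/2}\sum_{j=1}^{n} M_2(t_j) \\
&+ n^{1/2}\left(n^{-1}\sum_{j=1}^{n} a(t_j)^2 - \int_0^1 a(t)^2\,dt\right) \\
&+ n^{1/2}\left(a^2 - \left[n^{-1}\sum_{j=1}^{n} a(t_j)\right]^2\right),
\end{aligned}$$

where one should remember the identity $\int_0^1 (a(t) - a)^2\,dt = \int_0^1 a(t)^2\,dt - a^2$.
But

$$n^{1/2}\left(a^2 - \left[n^{-1}\sum_{j=1}^{n} a(t_j)\right]^2\right) = n^{1/2}\left(a - n^{-1}\sum_{j=1}^{n} a(t_j)\right)\left(a + n^{-1}\sum_{j=1}^{n} a(t_j)\right),$$

where the last factor on the right is bounded by a bound independent of $n$ and
$T_n$. Hence it follows that

$$n^{1/2}\left(E(s^2) - M_2 - \int_0^1 (a(t) - a)^2\,dt\right) \text{ tends to 0, as } n \text{ tends to infinity.}$$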
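By a computation of great length and little interest one shows that

$$\begin{aligned}
nE((s^2 - E(s^2))^2) = n^{-3}\Bigg[&(n-1)^2\sum_{j=1}^{n} M_4(t_j) + 4n(n-1)\sum_{j=1}^{n} M_3(t_j)a(t_j) \\
&- 4(n-1)\sum_{j=1}^{n} M_3(t_j)\sum_{h=1}^{n} a(t_h) + 2\left[\sum_{j=1}^{n} M_2(t_j)\right]^2 \\
&- (n^2 - 2n + 3)\sum_{j=1}^{n} M_2(t_j)^2 + 4n^2\sum_{j=1}^{n} M_2(t_j)a(t_j)^2 \\
&- 8n\sum_{j=1}^{n} a(t_j)\sum_{h=1}^{n} a(t_h)M_2(t_h) + 4\left[\sum_{j=1}^{n} a(t_j)\right]^2\sum_{h=1}^{n} M_2(t_h)\Bigg].
\end{aligned}$$

It is readily seen that this expression tends to

$$\begin{aligned}
M_4 &+ 4\int_0^1 M_3(t)a(t)\,dt - 4M_3a - \int_0^1 M_2(t)^2\,dt + 4\int_0^1 M_2(t)a(t)^2\,dt \\
&- 8a\int_0^1 a(t)M_2(t)\,dt + 4a^2M_2,
\end{aligned}$$

and now it is clear how to complete the proof of our theorem.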
COROLLARY 1. If $a(t)$ is constant and $M_2(t)$ of bounded variation, then

$$\lim_{n\to\infty} nE((s^2 - M_2)^2) = M_4 - \int_0^1 M_2(t)^2\,dt.$$

This is an almost immediate consequence of Theorem 3, since $a(t) = a$, if
$a(t)$ is constant.

It has been shown in section 4 that $d^2$ is always a consistent estimate of $M_2$
whereas $s^2$ is a consistent estimate of $M_2$ if, and only if, $a(t)$ is constant.
Theorem 1 and Corollary 1 offer a basis for comparing the efficiency of these two
statistics. Since

$$0 < M_2(t)^2 < M_4(t) \quad\text{for every } t$$

(apart from trivial exceptions), we infer from Theorem 1 and Corollary 1 the
following fact.
COROLLARY 2. If $a(t)$ is constant and $M_2(t)$ of bounded variation, then

$$\lim_{n\to\infty}\frac{E((s^2 - M_2)^2)}{E((d^2 - M_2)^2)} = 1 - \frac{\int_0^1 M_2(t)^2\,dt}{M_4},$$

and this expression is always positive and smaller than 1.

Thus we may say roughly that for large $n$ the estimate $s^2$ of $M_2$ is more efficient
than the estimate $d^2$, in case both may be used. We do, however, not offer
any information of the necessary size of $n$. Neither do we claim that for small
$n$ it might not happen that $d^2$ gives a good estimate and $s^2$ a poor one.
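As a rough illustration of Corollary 2, the limiting ratio can be evaluated for given moment functions. The sketch below assumes a simple midpoint rule for the integrals and takes $M_4 = \int_0^1 M_4(t)\,dt$; the function name `efficiency_ratio` and the example moment functions are hypothetical, not taken from the paper.

```python
def efficiency_ratio(m2_func, m4_func, steps=10000):
    """Limiting ratio 1 - (integral of M_2(t)^2) / M_4 from Corollary 2.

    Assumptions: midpoint-rule integration on [0, 1], and M_4 taken as
    the integral of M_4(t) over [0, 1].
    """
    h = 1.0 / steps
    midpoints = [(i + 0.5) * h for i in range(steps)]
    integral_m2_squared = sum(m2_func(t) ** 2 for t in midpoints) * h
    m4 = sum(m4_func(t) for t in midpoints) * h
    return 1.0 - integral_m2_squared / m4


# Normal errors with constant unit variance: M_2(t) = 1 and M_4(t) = 3.
normal_ratio = efficiency_ratio(lambda t: 1.0, lambda t: 3.0)
```

For normal errors with constant variance the ratio is $1 - 1/3 = 2/3$, so in that case the asymptotic variance of $s^2$ is two thirds of that of $d^2$.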
TESTING THE HOMOGENEITY OF POISSON FREQUENCIES
By Paul G. Hoel

University of California at Los Angeles


1. Introduction. The standard procedure for testing the homogeneity of a
set of k Poisson frequencies seems to be to apply the Poisson index of dispersion
to those frequencies. The originators of this procedure [1] pointed out that this
procedure may be regarded as a $\chi^2$ test of goodness of fit in which the Poisson
frequencies constitute observed frequencies corresponding to k cells with equal 
expected values. Somewhat later it was shown [2] that the corresponding like- 
lihood ratio test was approximately equivalent to the index of dispersion test. 
Then the problem was approached from the viewpoint of conditional variation 
[3], [4]. This approach permitted exact tests to be studied in some detail for 
small samples. A few years later an exact test for the special case of k = 2 
was introduced and studied [5]. In this investigation consideration was given for 
the first time to the efficiency of the proposed test. Tables of critical regions 
for the test and tables for computing the power of the test corresponding to 
certain alternatives were made available. 

In spite of the desirable features of this last test, it still possesses certain draw- 
backs. First, this test, as well as the others referred to, did not consider the 
problem in which the rate of occurrence of a rare event is constant but for which 
the sampling units differ in size. For example, these methods were not designed 
to enable one to test whether a factory’s accident rate had remained unchanged 
during the past month as compared with the preceding three months. Second, 
in order to use this test it is necessary to possess the special tables or charts of
critical regions constructed for the test.

In this paper a method which does not require special tables is considered for 
dealing with these more general situations. In the course of the development 
it is shown that this method is, in a certain sense, the best method possible for 
testing the hypothesis of homogeneity against one sided alternatives. Since this 
paper is principally concerned with removing the undesirable features of the 
method advocated in the last mentioned paper, it is advisable to read that paper 
in conjunction with this one. The procedure to be followed here will be to derive 
a uniformly most powerful test, show that it is equivalent to a $\chi^2$ test, and then
compare it with the previously mentioned test. 


2. Similar regions. In the following two sections a study will be made of the
efficiency of a generalization of the critical region proposed in [5]. For this
purpose let $x$ and $y$ represent sample frequencies from two independent Poisson
distributions with means $m_x$ and $m_y$. The probability of obtaining this sample
is given by

$$P(x, y) = \frac{e^{-m_x}m_x^{x}}{x!}\cdot\frac{e^{-m_y}m_y^{y}}{y!}. \tag{1}$$

Following the notation and procedure given in [5], let

$$\mu = m_x + m_y, \qquad p = \frac{m_x}{m_x + m_y}, \qquad n = x + y. \tag{2}$$

Then algebraic manipulation will show that $P(x, y)$ reduces to

$$P(x, y) = \frac{e^{-\mu}\mu^{n}}{n!}\cdot\frac{n!}{x!\,(n - x)!}\,p^{x}q^{n-x}, \qquad q = 1 - p. \tag{3}$$
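The reduction of (1) to (3) can be confirmed numerically. The following sketch, with hypothetical function names `joint_poisson` and `poisson_times_binomial`, evaluates both forms for given counts and means; it is an illustration only, not part of the original derivation.

```python
import math


def joint_poisson(x, y, m_x, m_y):
    """Formula (1): product of two independent Poisson probabilities."""
    px = math.exp(-m_x) * m_x**x / math.factorial(x)
    py = math.exp(-m_y) * m_y**y / math.factorial(y)
    return px * py


def poisson_times_binomial(x, y, m_x, m_y):
    """Formula (3): Poisson probability of n = x + y times the binomial
    conditional probability of x given n, with p = m_x / (m_x + m_y)."""
    mu = m_x + m_y
    p = m_x / mu
    n = x + y
    poisson_n = math.exp(-mu) * mu**n / math.factorial(n)
    binomial_x = math.comb(n, x) * p**x * (1.0 - p) ** (n - x)
    return poisson_n * binomial_x
```

The two functions agree up to floating-point rounding for any non-negative integers `x`, `y` and positive means, which is the factorization exploited in the construction of similar regions.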


The hypothesis which it is desired to test is that

$$\frac{m_y}{m_x} = r, \tag{4}$$

where $r$ has been specified. The value of $r$ will often be the ratio of the sizes of
the two populations under consideration or the ratio of the time units of the two
samples. In many situations the alternatives to (4) which are of interest will
be one-sided. For example, after a factory has instituted a safety campaign,
it would be of interest to see if the rate was unaffected as against the possibility
of the rate having decreased; hence the alternatives to (4) would be

$$\frac{m_y}{m_x} < r. \tag{5}$$

In terms of the parameters introduced in (2), the hypothesis (4) and its
alternatives (5) become

$$p = \frac{1}{1 + r} \quad\text{and}\quad p > \frac{1}{1 + r}. \tag{6}$$
Consider the probability given by (3) in much the same manner as was done
in [5]. This probability depends upon two parameters, $\mu$ and $p$, only the latter
of which is specified by the hypothesis; consequently if critical regions
independent of $\mu$ are desired, it will be necessary to find similar regions [6] with
respect to $\mu$. Since $x$ and $y$ are discrete variables, it is not possible to find similar
regions of arbitrary size; consequently it will be necessary to introduce continuous
approximating functions if such regions are desired and if best critical regions
are to be found. Toward this end consider the expression for $P(x, y)$ in (3).
It states that the probability that $x$ and $y$ will take on specified values is the
Poisson probability that the sample point will fall on the line $x + y = n$,
multiplied by the binomial conditional probability that the point will have the
specified $x$ coordinate when the point is known to lie on this line. If $p$ and $n$ are
not small, this binomial function could be approximated well by means of a normal
function. Or, if desired, factorials could be replaced by corresponding gamma
functions and the necessary normalizing factor introduced. Regardless of what
continuous function is chosen, a region on each line $x + y = n$ ($n = 0, 1, 2, \ldots$)
can be selected such that the conditional probability for this approximating
function is $\alpha$ that a point on that line will lie in that region. Most natural
approximating functions would become trivial for $n = 0$; therefore it may be
necessary to choose an artificial function for this case or to adopt a convention
of letting the origin be the critical region for this case but accepting only $100\alpha$
percent of samples for which $n = 0$ as belonging to this critical region. The
totality of such $\alpha$ regions will constitute a critical region of size $\alpha$ which is
independent of $\mu$ because from (3) the probability of a point lying in this critical
region would now be given by

$$\sum_{n=0}^{\infty}\alpha\,\frac{e^{-\mu}\mu^{n}}{n!} = \alpha e^{-\mu}\sum_{n=0}^{\infty}\frac{\mu^{n}}{n!} = \alpha.$$

Thus, similar regions with respect to $\mu$ of size $\alpha$ can be obtained by selecting
regions of size $\alpha$ on each line $x + y = n$.

The preceding method for obtaining similar regions is the only method for
doing so if such regions are restricted to be found on the lines $x + y = n$, because
if a region of size $\alpha_n$ were selected on each line $x + y = n$, it would be necessary that

$$\sum_{n=0}^{\infty}\alpha_n\,\frac{e^{-\mu}\mu^{n}}{n!} = \alpha$$

independent of $\mu$. This is equivalent to requiring that

$$\sum_{n=0}^{\infty}\frac{\alpha_n\mu^{n}}{n!} = \alpha e^{\mu};$$

but since the power series for $e^{\mu}$ is unique, it follows that $\alpha_n = \alpha$.


3. Common best critical region. Among these similar regions there will exist
a best critical region for testing the hypothesis $p = p_0$ against the single
alternative $p = p_1$ if there exist best critical regions on each line $x + y = n$. From (6)
it will be observed that this formulation is equivalent to testing the hypothesis
$r = r_0$ against the single alternative $r = r_1$. The best critical region [6] on such
a line, if it exists, will be that region which satisfies the inequality

$$\frac{f(x; p_0)}{f(x; p_1)} \le k, \tag{7}$$

where $f$ denotes the continuous function selected to approximate the binomial
distribution on this line and $k$ is a constant determined so that the probability,
under the hypothesis $p = p_0$, will be $\alpha$ that a point on this line will lie in this
region. If the normal approximating function with $m = np$ and $\sigma^2 = npq$ is
used, (7) becomes

$$\sqrt{\frac{p_1q_1}{p_0q_0}}\,\exp\left\{\frac{1}{2}\left[\frac{(x - np_1)^2}{np_1q_1} - \frac{(x - np_0)^2}{np_0q_0}\right]\right\} \le k. \tag{8}$$

After completing the square in $x$, it will be found that this inequality reduces to

$$\exp\left\{\frac{1}{2n}\left(\frac{1}{p_1q_1} - \frac{1}{p_0q_0}\right)\left[x - \frac{n(1/q_1 - 1/q_0)}{1/p_1q_1 - 1/p_0q_0}\right]^2\right\} \le c, \tag{9}$$

where $c$ is independent of $x$.
If $x_0$ is a value of $x$ such that

$$P[x > x_0 \mid p = p_0] = \alpha, \tag{10}$$

then (9) will hold for $x > x_0$ provided that $p_1 > p_0$. To demonstrate this fact,
it is convenient to consider the three cases $p_0 + p_1 \gtreqless 1$ separately. If
$p_0 + p_1 > 1$,

$$\frac{1}{q_1} - \frac{1}{q_0} > 0, \qquad \frac{1}{p_1q_1} - \frac{1}{p_0q_0} > 0, \qquad \frac{1}{q_1} - \frac{1}{q_0} > \frac{1}{p_1q_1} - \frac{1}{p_0q_0},$$

and therefore

$$x \le n < n\left(\frac{1}{q_1} - \frac{1}{q_0}\right)\Big/\left(\frac{1}{p_1q_1} - \frac{1}{p_0q_0}\right).$$

Since the coefficient of the brackets in (9) which involve $x$ is positive, increasing
$x$ will reduce the left side of (9). If $p_0 + p_1 < 1$,

$$\frac{1}{q_1} - \frac{1}{q_0} > 0, \qquad \frac{1}{p_1q_1} - \frac{1}{p_0q_0} < 0,$$

and

$$\frac{n(1/q_1 - 1/q_0)}{1/p_1q_1 - 1/p_0q_0} < 0 \le x.$$

Since the coefficient is now negative, increasing $x$ will reduce the left side of (9).
Finally, if $p_0 + p_1 = 1$, (9) will reduce to

$$e^{x(1/p_1 - 1/p_0)} \le c.$$

Since $1/p_1 - 1/p_0 < 0$, increasing $x$ will decrease the left side of this inequality.


It therefore follows that the region defined by (10) is a best critical region for
every alternative of the form $p_1 > p_0$ on the line $x + y = n$. The totality of
such regions for $n > 0$, together with the previously mentioned convention for
$n = 0$, then constitutes a common best critical region among all possible similar
regions for testing the hypothesis (4) against the set of alternatives (5).

In a similar manner it will be found that if the inequality in (10) is reversed,
the critical region so defined, together with the convention, will constitute a
common best critical region for every alternative of the form $p_1 < p_0$. If the
alternative hypotheses consist of $p \ne p_0$, there will not exist a common best
critical region using these approximating functions.

The critical region proposed in [5] is that for the special hypothesis $p_0 = \tfrac{1}{2}$ and
the set of alternatives $p \ne p_0$. It will be found that the lower half of this critical
region for $P = 2\alpha$ will differ little, except for very small samples, from that given
by (10) for this special case; however, it possesses the disadvantage of being
numerical and therefore of requiring a special table. The critical region given
by (10) does not possess this disadvantage. This fact will be demonstrated in
the next section.


4. Chi-square test. Consider the problem of testing compatibility between
observed and expected frequencies in two cells. Let $x$ and $y$ represent the
observed frequencies and $e_x$ and $e_y$ the expected frequencies in a sample of size $n$.
If the probability that an observation will fall in the first cell is, as in (6),
$p = 1/(1 + r)$, then

$$e_x = np = \frac{x + y}{1 + r} \quad\text{and}\quad e_y = n(1 - p) = \frac{r(x + y)}{1 + r}.$$

The chi-square function for testing compatibility then reduces to

$$\chi^2 = \sum_{i}\frac{(o_i - e_i)^2}{e_i} = \frac{(y - rx)^2}{r(y + x)}. \tag{11}$$

FIGURE 1.

Let $\chi_0^2$ be the value of $\chi^2$ such that $P[\chi^2 > \chi_0^2] = 2\alpha$ for one degree of freedom.
With $\chi^2$ replaced by $\chi_0^2$ in (11), this equation determines a parabola in the $x$, $y$
plane. If $x + y = n$ is not small, the probability of a point on the line $x + y = n$
lying outside of this parabola will be approximately $2\alpha$, the accuracy depending
on the accuracy of the $\chi^2$ approximation, and hence the probability of a point
lying outside of and below this parabola will be approximately $\alpha$. Thus, a critical
region for testing $p = p_0$ against $p > p_0$ will be given by that part of the positive
$x$, $y$ plane which lies below this parabola. In Figure 1 the lower half of this
parabola for the special case of $p_0 = \tfrac{1}{2}$ is indicated by the symbol $\chi^2$. The critical
region for the alternatives $p < p_0$ would be the region lying above the upper half
of this same parabola, while the critical region for the alternatives $p \ne p_0$ would
consist of both of these regions at the $2\alpha$ level. For one degree of freedom, $\chi$
has a standard normal distribution; consequently the critical region given by
(11) is the same as that given by (10) in which a normal approximation is used
on each line $x + y = n$. This equivalence is easily verified by replacing $y$ by
$n - x$ and $r$ by $q/p$ in (11).
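The reduction of the two-cell chi-square sum to the closed form (11) can be checked directly. The sketch below uses hypothetical function names; the counts in the usage note are made-up illustrative numbers, not data from the paper.

```python
def chi_square_closed_form(x, y, r):
    """Formula (11): chi-square = (y - r*x)^2 / (r*(x + y))."""
    return (y - r * x) ** 2 / (r * (x + y))


def chi_square_from_cells(x, y, r):
    """Sum of (observed - expected)^2 / expected over the two cells,
    with expected frequencies e_x = n/(1+r) and e_y = r*n/(1+r)."""
    n = x + y
    e_x = n / (1.0 + r)
    e_y = r * n / (1.0 + r)
    return (x - e_x) ** 2 / e_x + (y - e_y) ** 2 / e_y
```

For instance, with hypothetical counts of `x = 30` accidents over three months and `y = 4` over the following month, one would take `r = 1/3` and compare the resulting statistic with the chi-square value for one degree of freedom at the $2\alpha$ level, rejecting only when the point lies on the side of the parabola indicated by the one-sided alternative.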


5. Likelihood ratio test. The chi-square test of the preceding section yields
a common best critical region for testing (4) against (5) for the normal
approximation. It is interesting to compare this critical region with that obtained by
the maximum likelihood principle, which requires no such approximations.
Consider, therefore, the two dimensional parameter space

$$\Omega:\ m_x > 0,\ m_y > 0,$$

and the subspace

$$\omega:\ \frac{m_y}{m_x} = r.$$

Maximizing $P$ in (1) over $\Omega$ yields $\hat{m}_x = x$ and $\hat{m}_y = y$. Maximizing $P$ over $\omega$,
treating $P$ as a function of $m_x$, yields $\hat{m}_x = (x + y)/(1 + r)$. Then the maximum
likelihood ratio becomes

$$\lambda = \frac{\max P_\omega}{\max P_\Omega} = \frac{e^{-(x+y)}\left(\dfrac{x + y}{1 + r}\right)^{x+y}\dfrac{r^{y}}{x!\,y!}}{e^{-(x+y)}\dfrac{x^{x}y^{y}}{x!\,y!}}.$$

This reduces to

$$\lambda = \left(\frac{x + y}{1 + r}\right)^{x+y}\frac{r^{y}}{x^{x}y^{y}}. \tag{12}$$

For a fixed value of $\lambda$, this equation determines a curve in the $x$, $y$ plane which
may be used to determine a critical region. Since $-2\log\lambda$ is known to possess
an asymptotic chi-square distribution under certain conditions [7], choose as
critical region that part of the positive $x$, $y$ plane lying below the curve determined
by (12) when $\lambda$ has been replaced by $\lambda_0$, where $\lambda_0$ is determined from
$-2\log\lambda_0 = \chi_0^2$. This curve may be plotted by reducing it to the parametric form

$$x = \frac{\log\lambda_0}{(1 + v)\log\dfrac{1 + v}{1 + r} + v\log\dfrac{r}{v}}, \qquad y = vx.$$
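The parametric form follows from substituting $y = vx$ in the logarithm of (12), which gives $\log\lambda = x\,[(1 + v)\log\frac{1+v}{1+r} + v\log\frac{r}{v}]$. A minimal sketch of how points of the curve might be generated is given below; the function names are hypothetical, and the value 2.706 used in the usage note is the familiar chi-square point for one degree of freedom at the 0.10 level, corresponding to $\alpha = .05$.

```python
import math


def log_lambda(x, y, r):
    """Logarithm of the likelihood ratio (12), with 0*log(0) taken as 0."""
    def xlogx(t):
        return t * math.log(t) if t > 0 else 0.0

    n = x + y
    return n * math.log(n / (1.0 + r)) + y * math.log(r) - xlogx(x) - xlogx(y)


def curve_point(v, r, chi0_squared):
    """Point (x, y) with y = v*x on the curve -2*log(lambda) = chi0_squared.

    Requires v > 0 and v != r (for v = r the bracket vanishes and
    lambda = 1 on the whole ray).
    """
    log_lambda0 = -chi0_squared / 2.0
    bracket = (1.0 + v) * math.log((1.0 + v) / (1.0 + r)) + v * math.log(r / v)
    x = log_lambda0 / bracket
    return x, v * x
```

Letting `v` run over values below `r` traces the part of the curve under the line $y = rx$, for example `curve_point(0.5, 1.0, 2.706)`, and values above `r` trace the other branch.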

A comparison of the critical regions corresponding to (11), (12), and a slight
modification of [5] for the special case of $p_0 = \tfrac{1}{2}$ and $\alpha = .05$ is given in the
accompanying sketch. The modification of [5] consists in choosing $x_0$ to be that integer
which most nearly satisfies (10), rather than to be the smallest integer for which
the left side of (10) does not exceed $\alpha$. The latter method of choosing $x_0$ has a
tendency to make the first type of error considerably smaller than $\alpha$ for small
values of $n$. It will be observed that there are no appreciable differences between
the maximum likelihood and chi-square critical regions. Furthermore, it will
be found that there are only two values of $n$, namely $n = 3$ and $n = 9$, for $n < 30$
for which the chi-square test and the modification of [5] might yield different
decisions at this significance level.

The preceding sections show that the chi-square test is highly satisfactory for
testing the homogeneity of two Poisson frequencies, except possibly for very
small frequencies, and that therefore special numerical tables are not necessary.





6. Several Poisson frequencies. The generalization of (11) for a set of $k$
frequencies is, of course, the ordinary chi-square function

$$\chi^2 = \sum_{i=1}^{k}\frac{(x_i - np_i)^2}{np_i}, \tag{13}$$

where $n = \sum_{i=1}^{k} x_i$, $p_i$ is proportional to the sampling unit from which $x_i$ was
obtained, and $\sum_{i=1}^{k} p_i = 1$. The Poisson index of dispersion is merely a special
case of (13) when $p_i = 1/k$. The adequacy of (13) for this special case has been
studied elsewhere [3], [8], while studies of (13) in general are numerous and well
known.
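A minimal sketch of (13) for sampling units of unequal size follows. The function name `poisson_homogeneity_chi_square` and its argument names are hypothetical; the unit sizes are normalised inside the function so that the $p_i$ sum to 1.

```python
def poisson_homogeneity_chi_square(counts, unit_sizes):
    """Formula (13): sum of (x_i - n*p_i)^2 / (n*p_i), where p_i is
    proportional to the size of the sampling unit that produced x_i."""
    n = sum(counts)
    total_size = sum(unit_sizes)
    probabilities = [size / total_size for size in unit_sizes]
    return sum(
        (x - n * p) ** 2 / (n * p) for x, p in zip(counts, probabilities)
    )


def index_of_dispersion(counts):
    """Poisson index of dispersion: sum of (x_i - mean)^2 / mean."""
    mean = sum(counts) / len(counts)
    return sum((x - mean) ** 2 for x in counts) / mean
```

With equal unit sizes the statistic coincides with the Poisson index of dispersion, and with two units of sizes 1 and $r$ it coincides with (11). The result would be referred to the chi-square distribution with $k - 1$ degrees of freedom.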
The main problem considered here may be stated as follows:
Let $f_1(x), \ldots, f_n(x)$ be $n$ polynomials. It is the purpose of this paper to
establish formulas concerning the mathematical expectation (probable value)
of the product

$$f_1(x_1)\cdots f_n(x_n),$$

where $x_1, \ldots, x_n$ are positive random variables and the sum of these is supposed
known.
Before establishing the formulas let us introduce some notations for
convenience.
1. Notation. (A) In this paper the notation $(m; k; x_1, \ldots, x_n)$ or $(m; k; x)$
is used to denote that a set of numbers $(x_1, \ldots, x_n)$ is over all different
compositions of $m$ into $n$ parts with each $x \ge k$, i.e. over all different integer
solutions of the equation $x_1 + \cdots + x_n = m$ with each $x \ge k$.

(B) Let $m$, $\delta$ be two positive real numbers. The notation $E(m, \delta, [f_1]\cdots[f_n])$
denotes the mathematical expectation of the product $f_1(x_1)\cdots f_n(x_n)$ in which
the sum $m = x_1 + \cdots + x_n$ is known and for every $x_\nu$ ($\nu = 1, \ldots, n$) the value
of $x_\nu/\delta$ is a positive integer. The notation $E(m, \delta, [f_1]\cdots[f_n])$ thus implies that
the value of $m$ is a multiple of $\delta$. We call the $\delta$ a "varying unit", i.e. the least
possible difference between two different quantities $x_i$ and $x_j$, $i \ne j$. The
notation $E(m, \delta, [f]^n)$ is merely a special case that denotes the mathematical
expectation of the product $f(x_1)\cdots f(x_n)$ under the known conditions

$$x_1 + \cdots + x_n = m, \qquad \frac{x_\nu}{\delta} = \left[\frac{x_\nu}{\delta}\right] \ge 1 \quad (\nu = 1, \ldots, n),$$

where [ ] represents "integral part of".

(C) In order to simplify our formulas we always denote $f(x)$ by $f^{(x)}$,
$f_{\nu_1} + \cdots + f_{\nu_s}$ by $f_{\nu_1\cdots\nu_s}$, and $1\cdot p_1 + \cdots + k\cdot p_k$ by $\sigma(p)$ or $\sigma$. It is a
convention that $\binom{m}{n} = 0$ for $m < n$.



2. Lemmas. LEMMA 1. Let $m, r_1, \ldots, r_n$ be non-negative integers. Then

$$\sum_{(m;0;x)}\prod_{i=1}^{n}\binom{x_i}{r_i} = \binom{m + n - 1}{r_1 + \cdots + r_n + n - 1}. \tag{1}$$

PROOF: The lemma follows immediately by considering the coefficient of the
term $x^{m - r_1 - \cdots - r_n}$ on both sides of

$$\left(\frac{1}{1 - x}\right)^{r_1 + 1}\cdots\left(\frac{1}{1 - x}\right)^{r_n + 1} = \left(\frac{1}{1 - x}\right)^{r_1 + \cdots + r_n + n}.$$

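LEMMA 2. Let $a, b, c, \ldots$ be any constants, and $k_1, k_2, k_3, \ldots$ any positive
integers. Then

$$\sum_{(m;0;x)}\prod_{i=1}^{n}\left[a\binom{x_i}{k_1} + b\binom{x_i}{k_2} + c\binom{x_i}{k_3} + \cdots\right] = n!\sum_{(n;0;\alpha,\beta,\gamma,\ldots)}\binom{m + n - 1}{\alpha k_1 + \beta k_2 + \gamma k_3 + \cdots + n - 1}\frac{a^{\alpha}b^{\beta}c^{\gamma}\cdots}{\alpha!\,\beta!\,\gamma!\cdots}. \tag{2}$$

PROOF: Expanding the left-hand side of (2) we see that the coefficient of the
term $a^{\alpha}b^{\beta}c^{\gamma}\cdots$ is equal to

$$\frac{n!}{\alpha!\,\beta!\,\gamma!\cdots}\sum_{(m;0;x)}\binom{x_1}{k_1}\cdots\binom{x_\alpha}{k_1}\binom{x_{\alpha+1}}{k_2}\cdots\binom{x_{\alpha+\beta}}{k_2}\binom{x_{\alpha+\beta+1}}{k_3}\cdots.$$

By Lemma 1 it becomes

$$\frac{n!}{\alpha!\,\beta!\,\gamma!\cdots}\binom{m + n - 1}{\alpha k_1 + \beta k_2 + \gamma k_3 + \cdots + n - 1}.$$

Hence the lemma.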


Lemma 3. Let m, n(S m) be two positive integers. 
nomial f(x) of the kth degree, we have 


m+n-—1 k [(f — 1) le a 
(3) Da fla) «++ Sen) a eS ty Boy 


v=0 pt 
where (o = f(x), o = o(p) lpit-:-- + kp. 
Proor: Since f(x) is a polynomial of the kth degree, there exist (kK + 1) values 


Be, °** , Bo Such that 
k ce 
t=0 a 


By putting x = 0,1, --- , k, it is orderly determined that 


By = - = oe es -+ (—})" ( "\" = (f - 1". (py os 0, 1, ae. _k). 


The lemma is thus obtained by (2). 
For convenience we denote the summation : 2 (m; 1; x) filai) --+ fa(an) by 
(m;1;2) 


S(m, [fi] --- [f,]). Thus the formula (3) can be written as 


a os (¥) 1 Pp 
Sim, [f") =a! oo. Vay 1)?" 


(np) \O + 1 v Dy! 


Then, for any given poly- 
LEMMA 4. Let $f_1(x), \ldots, f_n(x)$ be $n$ given polynomials. Then

$$S(m, [f_1]\cdots[f_n]) = \frac{1}{n!}\sum_{\substack{(\nu_1\cdots\nu_k)\\ 1 \le k \le n}}(-1)^{n-k}\,S(m, [f_{\nu_1} + \cdots + f_{\nu_k}]^{n}), \tag{4}$$

where $(\nu_1\cdots\nu_k)$ runs over all different combinations out of $(1\cdots n)$,
$k = 1, \ldots, n$.
PROOF: The proof depends essentially on the formal logic theorem.
Considering a typical term

$$S(m, [f_{\nu_1}]^{q_1}\cdots[f_{\nu_t}]^{q_t}), \qquad 1 \le t \le n, \quad q_1 + \cdots + q_t = n,$$

we see that it is contained in the last $(n - t + 1)$ summations of the right-hand
side of (4), i.e. in the summations $(\nu_1\cdots\nu_k)$ as $k = t, t + 1, \ldots, n$. The
number of occurrences of the term in the right-hand side of (4) is therefore

$$\sum_{\nu=0}^{n-t}(-1)^{\nu}\binom{n - t}{\nu} = \begin{cases} 0 & \text{if } t < n, \\ 1 & \text{if } t = n. \end{cases}$$

The term vanishes generally except when $q_1 = \cdots = q_t = 1$. Hence the
right-hand side gives

$$S(m, [f_1]\cdots[f_n]).$$


3. Theorems with formulas. In the following statements of theorems and 
corollaries, the notation (x --- 2,) is always to denote a set of undetermined 
quantities, though the kind of the quantities of the set is stated. 

THEOREM 1. Let (x - ++ Xn) be a set of natural numbers under a known condition 
m+:++ +2, =m. Then, for any given polynomial f(x) of the kth degree, we have 


‘ . i m+n-1 wat 
(5) E(m, 1, [f]") = C = ' a. (” ae ye mania 
n— 1 . 


Proor: Let m’ = m+ nr. By lemma 1 we then have 


vy In\ _ _ (m' 
o (3) ok i - Bas 7 ( 


This is the number of compositions of m’ into n parts with each part 2 r. In 
particular, for r = 1 we see that the number of compositions of m into n parts is 


n-—!] 
value is equal to 


(” 7m > Thus by the definition of mathematical expectation, the required 


S(m, [f]") ‘ m—1\" " 
sma i (GIy) smu 


The theorem is therefore proved by Lemma 3. 
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Corouuary 1. Let (2 --+ Xn) be a set of positive quantities, of which the vary- 


ing unit is 6, and the sum is m. Then, for any given polynomial f(x) of the kth 
degree, we have 


! = — ] k wa (¥)1Py 
6) Bm3,")=7™~ y fe" |e 


7 oa (n30;p) Cc -f n—]1 v=0 pv! 


n— I! 
where 
g(x) = f(éx), o=Ip+--- + kp. 
Proor: It is deduced by the relation E(m, 6, [f(x)]") = E(m/6, 1, [f(6x)"). 
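COROLLARY 2. Let $(x_1\cdots x_n)$ be a set of non-negative real numbers under a
known condition $x_1 + \cdots + x_n = m$. Then, for any given polynomial
$f(x) = a_0 + \cdots + a_kx^{k}$, we have

$$E(m, 0, [f]^{n}) = n!\,(n - 1)!\sum_{(n;0;q)}\frac{m^{\sigma}}{(\sigma + n - 1)!}\prod_{\nu=0}^{k}\frac{(\nu!\,a_\nu)^{q_\nu}}{q_\nu!}, \tag{7}$$

where

$$a_k \ne 0, \qquad \sigma = \sigma(q) = q_1 + \cdots + kq_k.$$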
Proor: The proof of the corollary depends essentially on the concept that two 
different real numbers may differ by an arbitrarily small number h. 
Let h be an arbitrary positive number and let f(ah) = h*g(x, h), where the 
number k is the degree of f(a). Then, since 
0 if p>n 
: ' “ a 
> (-1"(”) (n —Prat ;, ul p=n 
_ ( 9 ) nm if p=n+1, 


we may write 
> (-1) @ g(v —3,h) = h’ “[vla, + h-R,(h)I, 
s=0 


where lim R,(h) = F : ') on 


Now we pass to the limit h — 0, in which it is assumed that / runs through a se- 
quence of rational numbers of the form 1/N. Thus by Corollary 2 we have 
o k 
s . m (v!a,)”” 
lim E(m, h = ni(n — 1)! po , ; 





Hence the corollary. 


It may be noted that this corollary can also be independently deduced by the 
proportion of the two integrals: 
a 
R 


| a [ s@ 6+ + f (tn) day +++ dta—a: 
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where the integrals are all taken over the region R: x; + --: + 2%, = m, % > 
6, --- 2, 2 @. 

Corouuary 3. Let (a, --- an) be a set of positive real numbers under a known 
condition a < a+ +--+: +a, < b, where a, b are non-negative numbers. Then, 
for any given polynomial f(x) = a, + +++ + a,x" (a, ¥ 0), the mathematical ex- 
pectation of the product f(x1) «++ f(xn), which we denote by E((ab), 0, [f]"), is given 
by the formula 


E(a, b), 0, [f]") = => 


(8) 


1+0(q) 1+¢(q) 
pre _ give ai” 


a4 (1 + o(@)-(n— 1+)! qo! 


Proor: Since the required mathematical expectation is the mean 


1 b 
“hs 7 E(u, 0, [f]") du, 


Corollary 3 follows from Corollary 2. 
On the other hand we see that 
lim E(a, a + h), 0, [f]") = E(a, 0, [f]"). 
h-0 
Hence Corollary 2 can also be deduced from Corollary 3. 


THEOREM 2. (First generalization of Theorem 1). Let $f_1(x), \ldots, f_n(x)$ be $n$
given polynomials, of which the highest degree is $k$. Then we have


Bm, 1, (fle = D(H 


Vy-+-¥s) (250; p) 
l<sssn 


? 


(" +n-1 
x Not m= IY Ty (Garey — DY 


m — 1 Pu! 
n— |i 
where 


Proor: In the proof of theorem 1 we have seen that 


E(m, 1, [f]") = c a 7 S(m, [f}"). 


Thus, by similar reasoning and lemma 4, we have 


m-— 1 
n—1 


Vy-*+ Ms) 
l<s<n 


E(m, 1, (fil «++ Ufal) = > vat! le S(m, [fo,--»]")- 
Gn) 














374 L. C. HSU 


The theorem is proved by lemma 3. 
Corouuary 1. Let 6 be a varying unit. Then 


E(m, 8, (fil «++ (fal) = nie 7 {-1)" 


coe (30; p) 


= +n-l1 
(10) (: ‘ lata ) k [( ae 1)°}* 
o Il Jvy---¥, 


Xx ’ 
a u=0 Pu! 
) 
n—1 


g(x) = f,(6z), Jry---¥, = Jn, + °° + Gr, - 


Proor: By the relation E(m, 4, [f; (z)] --- [fn(x)]) = E (m/6, 1, [fi(6x)] --- 
[f.(5x)]) we obtain the corollary. 
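COROLLARY 2. For any positive real number $m$, we have

$$E(m, 0, [x^{p_1}]\cdots[x^{p_n}]) = \frac{p_1!\cdots p_n!\,(n - 1)!}{(p_1 + \cdots + p_n + n - 1)!}\,m^{p_1 + \cdots + p_n}. \tag{11}$$

PROOF: Since $E(m, \delta, [f_1]\cdots[f_n]) = \sum (-1)^{n-s}/n!\ E(m, \delta, [f_{\nu_1\cdots\nu_s}]^{n})$, we have,
by letting $\delta \to 0$,

$$E(m, 0, [f_1]\cdots[f_n]) = \sum_{\substack{(\nu_1\cdots\nu_s)\\ 1 \le s \le n}}\frac{(-1)^{n-s}}{n!}\,E(m, 0, [f_{\nu_1\cdots\nu_s}]^{n}).$$

The corollary is therefore deduced by (7).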
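Formula (11) can be checked exactly in the case $n = 2$, where $x_1$ is uniformly distributed on $[0, m]$ and $x_2 = m - x_1$, so that the expectation is $m^{-1}\int_0^m x^{p_1}(m - x)^{p_2}\,dx$. The sketch below is an illustration restricted to that case; the function names are hypothetical, and the integral is evaluated by binomial expansion with rational arithmetic.

```python
from fractions import Fraction
from math import comb, factorial


def formula_11(m, powers):
    """Right-hand side of (11) for exponents p_1, ..., p_n."""
    n = len(powers)
    numerator = factorial(n - 1)
    for p in powers:
        numerator *= factorial(p)
    denominator = factorial(sum(powers) + n - 1)
    return Fraction(numerator, denominator) * Fraction(m) ** sum(powers)


def direct_expectation_two_parts(m, p1, p2):
    """E[x1^p1 * x2^p2] for n = 2: (1/m) * integral over [0, m] of
    x^p1 * (m - x)^p2, expanded with the binomial theorem."""
    m = Fraction(m)
    integral = sum(
        comb(p2, k) * (-1) ** k * m ** (p2 - k) * m ** (p1 + k + 1) / (p1 + k + 1)
        for k in range(p2 + 1)
    )
    return integral / m
```

For example, `formula_11(3, [2, 3])` and `direct_expectation_two_parts(3, 2, 3)` return the same rational number.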

THEOREM 3. (Second generalization of Theorem 1). Let (a ---+ xn) be a set 
of integers under known conditions x, + --- + 2%, = m,a < x; < b, where m, a, b 
. are given integers. Then, for any given polynomial f(x), the mathematical expecta- 
tion of the product f(x) --- f(xn), denoted by E (m, 1, [f]", is given by the formula 

(ab) 


+ (- mt ) sim (gl'(hI"~’) 
(12) E (mit) == eo y 

yew ()(e 77) 
where 


g(x) = f(b+ 2), h(x) = f(a+ a — 1) and m’ = m — (a— 1)n+ (a — db — 1)». 


i ; : 0, _ Oform> 0 : 
Proor: Define S(m, [f]") = 0 for m < n, and S(m, [f]) = “ego We 





shall now prove that 


x (7) sew, (oth) = flan) «++ flan), 


(71°: -In) 
a<z<b 
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where on the right-hand side of the expression the set (%, --- 2,) under the 
summation runs over all different compositions of m into n parts and 


esa & & y = 1,---,n. 


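For convenience we denote the left-hand side of the expression by $\mathfrak{S}$, that is,

$$\mathfrak{S} = \sum_{\nu=0}^{n} (-1)^\nu \binom{n}{\nu}\, S(m', [g]^\nu [h]^{n-\nu}) = \sum_{\nu=0}^{n} (-1)^\nu \binom{n}{\nu} \sum_{m_1} S(m_1, [f(x + b)]^\nu)\, S(m' - m_1, [f(x + a - 1)]^{n-\nu}).$$

Let $f(\bar{x}_1) \cdots f(\bar{x}_n)$ be a product term contained in $\mathfrak{S}$, i.e., $\bar{x}_1 + \cdots + \bar{x}_n = m$;
$\bar{x}_1 \ge a, \cdots, \bar{x}_n \ge a$. We assume that $\bar{x}_{\nu_1} \ge b + 1, \cdots, \bar{x}_{\nu_t} \ge b + 1$, where
$\nu_i \ne \nu_j$ if $i \ne j$. Then it is seen that the number of occurrences of the product
term in $\mathfrak{S}$ is given by

$$\sum_{\nu=0}^{t} (-1)^\nu \binom{t}{\nu} = \begin{cases} 0 & \text{if } t \ge 1, \\ 1 & \text{if } t = 0. \end{cases}$$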
Thus the product term $f(\bar{x}_1) \cdots f(\bar{x}_n)$ of $\mathfrak{S}$ vanishes except when

$$a \le \bar{x}_\nu \le b, \qquad \nu = 1, \cdots, n.$$

Hence we have

$$\mathfrak{S} = \sum f(x_1) \cdots f(x_n).$$


Next, we shall find the number of different compositions of m into n parts with
each $a \le x_i \le b$, i.e., the number of product terms of $\mathfrak{S}$. By the above result
we see that the number is given by

$$\sum_{\nu=0}^{n} (-1)^\nu \binom{n}{\nu}\, S(m', [1]^n) = \sum_{\nu=0}^{n} (-1)^\nu \binom{n}{\nu} \binom{m'-1}{n-1}.$$


Hence the theorem. 
This theorem shows that the mathematical expectation $E_{(a,b)}(m, 1, [f]^n)$ can be
expressed by $S(m, [g]^\nu)$ and is therefore expressible in terms of linear combinations
of the coefficients of the polynomial f(x).
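As an illustration, formula (12) can be compared with a direct enumeration for small cases. The sketch below is not part of the paper: it assumes that $S(m, [f_1] \cdots [f_n])$ denotes the sum of $f_1(x_1) \cdots f_n(x_n)$ over all compositions of m into n positive integer parts (with the value 1 for the empty product when m = 0 and 0 otherwise), and the helper names `compositions`, `S`, `binom`, `expectation_formula` and `expectation_brute` are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb, prod


def compositions(m, n):
    """Yield every n-tuple of positive integers whose sum is m."""
    if n == 0:
        if m == 0:
            yield ()
        return
    for first in range(1, m - n + 2):
        for rest in compositions(m - first, n - 1):
            yield (first,) + rest


def S(m, funcs):
    """Assumed meaning of S: sum of funcs[0](x_1)...funcs[-1](x_n) over compositions of m."""
    return sum(prod(f(x) for f, x in zip(funcs, c))
               for c in compositions(m, len(funcs)))


def binom(top, bottom):
    """Binomial coefficient taken as zero outside 0 <= bottom <= top."""
    return comb(top, bottom) if 0 <= bottom <= top else 0


def expectation_formula(m, n, a, b, f):
    """Right-hand side of (12) with g(x) = f(b + x), h(x) = f(a + x - 1)."""
    g = lambda x: f(b + x)
    h = lambda x: f(a + x - 1)
    num = den = 0
    for v in range(n + 1):
        m_prime = m - (a - 1) * n + (a - b - 1) * v
        weight = (-1) ** v * comb(n, v)
        num += weight * S(m_prime, [g] * v + [h] * (n - v))
        den += weight * binom(m_prime - 1, n - 1)
    return Fraction(num, den)


def expectation_brute(m, n, a, b, f):
    """Average of f(x_1)...f(x_n) over all integer n-tuples in [a, b] with sum m."""
    terms = [prod(f(x) for x in xs)
             for xs in product(range(a, b + 1), repeat=n) if sum(xs) == m]
    return Fraction(sum(terms), len(terms))


poly = lambda x: x * x + 1
print(expectation_formula(10, 3, 2, 5, poly) == expectation_brute(10, 3, 2, 5, poly))
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.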


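COROLLARY 1. Let $\delta$ be a varying unit for which $m/\delta$, $a/\delta$, $b/\delta$
are all integers. Then

$$E_{(a,b)}(m, \delta, [f(x)]^n) = E_{(a/\delta,\, b/\delta)}\!\left(\frac{m}{\delta}, 1, [f(\delta x)]^n\right).$$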


COROLLARY 2. Let $f_1(x), \cdots, f_n(x)$ be n given polynomials. Then

$$E_{(a,b)}(m, 1, [f_1] \cdots [f_n]) = \sum_{\substack{(\nu_1 \cdots \nu_s) \\ 1 \le s \le n}} \frac{(-1)^{n-s}}{n!}\, E_{(a,b)}(m, 1, [f_{\nu_1 \cdots \nu_s}]^n).$$
COROLLARY 3. The number of integral solutions of the equation $x_1 + \cdots + x_n = m$
with $a_1 \le x_1 \le b_1, \cdots, a_n \le x_n \le b_n$ is equal to

$$\sum_{\nu_1 = 0, \cdots, \nu_n = 0}^{1, \cdots, 1} (-1)^{\nu_1 + \cdots + \nu_n} \binom{m - (a_1 + \cdots + a_n) + n + (a_1 - b_1 - 1)\nu_1 + \cdots + (a_n - b_n - 1)\nu_n - 1}{n - 1}.$$













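PROOF: We have shown that the number of integral solutions of the equation
$x_1 + \cdots + x_n = m$ with $a \le x_i \le b$ is given by

$$\sum_{\nu=0}^{n} (-1)^\nu \binom{n}{\nu} \binom{m - (a - 1)n + (a - b - 1)\nu - 1}{n - 1}.$$

Hence the number of integral solutions of the equation $x_{11} + \cdots + x_{1n_1} + \cdots + x_{s1} + \cdots + x_{sn_s} = m$ with $a_i \le x_{ij} \le b_i$ $(j = 1, \cdots, n_i;\ i = 1, \cdots, s)$
is given by

$$\sum_{\nu_1=0}^{n_1} \cdots \sum_{\nu_s=0}^{n_s} (-1)^{\nu_1 + \cdots + \nu_s} \prod_{i=1}^{s} \binom{n_i}{\nu_i} \sum_{m_1 + \cdots + m_s = m}\ \prod_{i=1}^{s} \binom{m_i - (a_i - 1)n_i + (a_i - b_i - 1)\nu_i - 1}{n_i - 1}$$

$$= \sum_{\nu_1 = 0, \cdots, \nu_s = 0}^{n_1, \cdots, n_s} (-1)^{\nu_1 + \cdots + \nu_s} \binom{n_1}{\nu_1} \cdots \binom{n_s}{\nu_s} \binom{m - (a_1 - 1)n_1 - \cdots - (a_s - 1)n_s + (a_1 - b_1 - 1)\nu_1 + \cdots + (a_s - b_s - 1)\nu_s - 1}{n_1 + \cdots + n_s - 1}.$$

The corollary follows at once by putting $n_1 = \cdots = n_s = 1$, $s = n$.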

This corollary can be restated in a more interesting manner as follows: 

Let there be n storerooms, and let $b_1, \cdots, b_n$ be the numbers of stocks contained
in the 1st, 2nd, $\cdots$, n-th storerooms respectively. Then m stocks containing
at least $a_i$ stocks of the i-th storeroom $(i = 1, \cdots, n)$ can be chosen from
these n storerooms in

$$\sum_{\nu_1 = 0, \cdots, \nu_n = 0}^{1, \cdots, 1} (-1)^{\nu_1 + \cdots + \nu_n} \binom{m + n + (a_1 - b_1 - 1)\nu_1 + \cdots + (a_n - b_n - 1)\nu_n - (a_1 + \cdots + a_n) - 1}{n - 1}$$

different ways.
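The counting formula of Corollary 3 can be checked against direct enumeration. The following sketch is an illustration only; the function names are hypothetical, and a binomial coefficient with a negative or too small upper index is taken to be zero, which is the convention assumed here for the alternating sum.

```python
from itertools import product
from math import comb


def binom(top, bottom):
    """Binomial coefficient taken as zero outside 0 <= bottom <= top."""
    return comb(top, bottom) if 0 <= bottom <= top else 0


def count_formula(m, lower, upper):
    """Alternating sum over nu_1, ..., nu_n in {0, 1} as in Corollary 3."""
    n = len(lower)
    total = 0
    for nu in product((0, 1), repeat=n):
        shift = sum((a - b - 1) * v for a, b, v in zip(lower, upper, nu))
        top = m - sum(lower) + n + shift - 1
        total += (-1) ** sum(nu) * binom(top, n - 1)
    return total


def count_brute(m, lower, upper):
    """Enumerate every integer point of the box and keep those with sum m."""
    ranges = [range(a, b + 1) for a, b in zip(lower, upper)]
    return sum(1 for xs in product(*ranges) if sum(xs) == m)


# Storeroom reading: stocks b = (5, 6, 4), required minima a = (0, 1, 2), m = 12 chosen.
print(count_formula(12, [0, 1, 2], [5, 6, 4]) == count_brute(12, [0, 1, 2], [5, 6, 4]))
```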
So far we have established several combinatorial formulas concerning the
mathematical expectation of the product $f_1(x_1) \cdots f_n(x_n)$ under certain conditions.
In the next section, we shall explain how to apply these formulas.









4. Applications. (a) A criterion. In order to make the above formulas
applicable to practical problems we state a criterion as follows: The mathematical
expectation of a function $F(x_1, \cdots, x_n)$ can be estimated by the above
combinatorial formulas if and only if the sum of these undetermined quantities
$x_1, \cdots, x_n$ is known and there exist n polynomials $f_1(x), \cdots, f_n(x)$ such that
Fefi,:::,F«f,, where the quantities x, --- , x, may or may not be conti- 
nuous. When the quantities are discontinuous, the varying unit is certainly 
given. 

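(b) Some approximations. For $f(x) = \beta_0 + \cdots + \beta_k x^k$ $(\beta_k \ne 0)$ we may write

$$(f - 1)^\nu = \sum_{s=0}^{k} \nu!\, \beta_s\, \mathfrak{S}_{\nu,s},$$

where $\mathfrak{S}_{\nu,s}$ is a Stirling number of the second kind, as used by Jordan, and defined by

$$\nu!\, \mathfrak{S}_{\nu,s} = \sum_{x=0}^{\nu} (-1)^{\nu - x} \binom{\nu}{x} x^s.$$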
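A small numerical check of this expansion may be helpful. The sketch below assumes that $(f - 1)^\nu$ is meant symbolically, i.e. that after binomial expansion the power $f^x$ is read as $f(x)$, so that the left-hand side is the $\nu$-th finite difference of f at zero; this reading, and the function names, are assumptions of the illustration rather than statements of the paper.

```python
from math import comb, factorial


def stirling2(v, s):
    """Stirling number of the second kind from the alternating-sum definition."""
    total = sum((-1) ** (v - x) * comb(v, x) * x ** s for x in range(v + 1))
    return total // factorial(v)


def symbolic_power(coeffs, v):
    """(f - 1)^v read symbolically: sum of (-1)^(v-x) C(v, x) f(x)."""
    f = lambda x: sum(b * x ** s for s, b in enumerate(coeffs))
    return sum((-1) ** (v - x) * comb(v, x) * f(x) for x in range(v + 1))


def via_stirling(coeffs, v):
    """Right-hand side: sum over s of v! * beta_s * S_{v,s}."""
    return sum(factorial(v) * b * stirling2(v, s) for s, b in enumerate(coeffs))


beta = [3, -1, 0, 2]  # f(x) = 3 - x + 2x^3
print(all(symbolic_power(beta, v) == via_stirling(beta, v) for v in range(6)))
```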
Thus, the formulas (5) and (9) can be written as follows: 


(730; p) (m — a)!(o +n — 1)!(m — 1)! 
(By Sv» + +++ - Be Syn)” 
= 
E(m, 1, [fil eo [ fal) a , >» Zz (—1)"* 


vy°*¥s) (n;0;p) 
l1sssn 


iniin~ fete Pe~ ee- 
(5') 


9’ 
w) (m+n — 1)!(m — n)Inl(n — 1)! Il (By 8,» - 15 > B, S,x)”” 

(m— a)'(o +n — 1I)!(m— 1)! 5<0 Dr! ; 
where 


S., = vI8,., fi = Bio +--+ But", B; = Bu +++ + Bu. 


Now we state some convenient formulas concerning the number $\mathfrak{S}_{m,n}$.


If m is sufficiently large and t is smaller than m, the following recurrence relation
is useful:


Snnmet—1 = do - ae ix ') + »(” a . ') 
4 +n ae ') 


2t —2 
“ m+t m+t 


tee. + [(2¢ _ 1)At~2 + 1-1] P2 , 


where A, = 1, \;-1 = O and \y, --+ , Ave are all independent of m. 
Starting from the first equality and using the recurrence relation $\mathfrak{S}_{m,n+1} = m\,\mathfrak{S}_{m,n} + \mathfrak{S}_{m-1,n}$ successively we have


™ 


OQm,.m+t = >» (m —v+ |) 


v=] 


t—2 ™m a m oe 
= Ed  . ra+i+ 1) +> (" as "G+ 1)| 


x m+t . m-+-t ’ 
-UMNG Ph aetstot( Ta oto] 
_<x a ; m +t ) 


where A_; = Ay = 0. The recurrence relation is thus deduced. 


Writing

$$\mathfrak{S}_{m,m+t} = \binom{m+t}{t+1} + \lambda_1 \binom{m+t}{t+2} + \cdots + \lambda_{t-1} \binom{m+t}{2t},$$

and using the recurrence relation as obtained above, the coefficients $\lambda_1, \cdots, \lambda_{t-1}$
may be exhibited as follows:

 t |  λ_1 |   λ_2 |     λ_3 |      λ_4 |      λ_5 |      λ_6 |      λ_7 |      λ_8
 1 |      |       |         |          |          |          |          |
 2 |    3 |       |         |          |          |          |          |
 3 |   10 |    15 |         |          |          |          |          |
 4 |   25 |   105 |     105 |          |          |          |          |
 5 |   56 |   490 |    1260 |      945 |          |          |          |
 6 |  119 |  1918 |    9450 |    17325 |    10395 |          |          |
 7 |  246 |  6825 |   56980 |   190575 |   270270 |   135135 |          |
 8 |  501 | 22935 |  302995 |  1636635 |  4099095 |  4729725 |  2027025 |
 9 | 1012 | 74316 | 1487200 | 12122110 | 47507460 | 94594500 | 91891800 | 34459425
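The table can be regenerated mechanically. The sketch below assumes the general recurrence $\lambda_j(t) = (j + 1)\lambda_j(t - 1) + (t + j)\lambda_{j-1}(t - 1)$ with $\lambda_0 = 1$; this form is not printed in the text but follows from the Stirling recurrence and reduces to the two special cases given for $\lambda_{t-1}(t)$ and $\lambda_{t-2}(t)$. The function names are hypothetical.

```python
from math import comb


def lambda_row(t):
    """Return [lambda_0(t), ..., lambda_{t-1}(t)] from the assumed recurrence."""
    row = [1]  # t = 1: only lambda_0 = 1
    for step in range(2, t + 1):
        prev = row + [0]  # lambda_{step-1}(step-1) is taken as zero
        row = [1] + [(j + 1) * prev[j] + (step + j) * prev[j - 1]
                     for j in range(1, step)]
    return row


def stirling2(n, k):
    """Number of partitions of an n-set into k blocks (standard recurrence)."""
    table = [[0] * (k + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, min(i, k) + 1):
            table[i][j] = j * table[i - 1][j] + table[i - 1][j - 1]
    return table[n][k]


def expansion(m, t):
    """Binomial expansion of the Stirling number with m parts and m + t elements."""
    return sum(lam * comb(m + t, t + 1 + j) for j, lam in enumerate(lambda_row(t)))


for t in range(1, 10):
    print(t, lambda_row(t)[1:])
```

Each printed row lists $\lambda_1, \cdots, \lambda_{t-1}$ and can be compared with the table above.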


Now let 


_|f(nt+t n+t . n+t 
Bante “(CP i)+xo (rt o)+ + New cw ( oy )]n. 


The recurrence relation obtained above gives

$$\lambda_{t-1}(t) = (2t - 1)\,\lambda_{t-2}(t - 1),$$

$$\lambda_{t-2}(t) = 2(t - 1)\,\lambda_{t-3}(t - 1) + (t - 1)\,\lambda_{t-2}(t - 1).$$
Thus we obtain 


(2t)! 
t! 2° 


Ar-2 (t) = (t — 1)! > rr ae . 


v=l 


Nt-1 (t) - 
Let


“<5 ” 
z=] 222 x . 

n+t 

2i—1 


, 


Since the orders of 4 ' - t), +e ) are all less than 2t asn > ~, 


and since 


n+t 
( 2t ~— = 





bo| 3, 


enn, 

—— 

8 = 
M | 

S — 

> ———, 
— 
+ 

| 

8 

Nese 


ro|/ 3S, bo 


|s 
OC -™ 
_ _ 


es 


(1 — O(n”)) 
— O(n) (=) 1 
om” CDNB) 
. n+t 1 
Sew sl )a-pre A(t) 


a _. 2t\" (a ber 


! 


a ( -§) 
:) 


ll 
_ 
+ 








t 


- 80, (2) ys 
s'0c0(% ) + O(n™ a s. 


2/7, ao 





We may write (by Stirling’s formula) 


nn ONE (ie EL es) 


e 2 t! 
where $\epsilon_n \to 0$ as $n \to \infty$.


Now it is easily proved that the inequality 


x. « (22 xz-—1 
f= > = (7) > 4/ - 


holds for every positive es xz. We have, — 


wW<y t<[ 4/2a=55 57, — 0; 
w> L/t>[- ia -3.0-0 
and
an 2i\7} t a 
<4! < > vr. 
Vie 9) Vi-i Vv" 
Using these inequalities we have 
_2 1- giay 2 i en 
b= 3 Vt (t+ 2) <4 (7) a) <34/ = « 1) = u, 
where it may be noted that 
lim “! = 1. 


to i 


Hence we have in conclusion 


a) (-y (”) ott = @) ey Vim é Lk) +tte *), 


e Z t! n 


2 k(t) 2 t 
gen <A *e8 ¢- 1" ~»- 


Evidently the formula (14) implies (15) and (16): 


Bun) 





where 





(15) 
~(2)' (5 “) var (4 +2), t= O(n), €>0. 
e 2 
(16) $n! \sim \left(\dfrac{n}{e}\right)^n \sqrt{2\pi n}$. (Stirling's formula).
ON THE CONSTITUENT ITEMS OF THE REDUCTION AND THE
REMAINDER IN THE METHOD OF LEAST SQUARES


By S. VAJDA
London
1. Consider a set of variates $y_i$, $(i = 1, 2, \cdots, n)$, which are normally and
independently distributed with variance 1. Let also a matrix $(x_{ik})$ with $i = 1, 2, \cdots, n$; $k = 1, 2, \cdots, s$ and rank s be given. Find $b_1, \cdots, b_s$ in terms of $y_i$
so that

$$\Psi = \sum_i \Bigl( y_i - \sum_k x_{ik} b_k \Bigr)^2$$

is a minimum. This minimum value shall be denoted by $\Psi_{\min}$.

It is known (see e.g. R. A. Fisher, "Applications of Student's distribution",
Metron Vol. 5, Part 3 (1925)) that $\Psi_{\min}$ varies as does $\chi^2$ with n − s degrees of
freedom and that it is possible to express $\Psi_{\min}$ as the sum of n − s squares of
linear functions of the $y_i$. In the following lines $\sum y_i^2$ will be expressed as the
sum of n squares of such functions which are independent and of variance 1.
The sum of the first s squares will equal $\sum y_i^2 - \Psi_{\min}$ and therefore the remaining
n − s squares equal $\Psi_{\min}$.

Thus a simple way will be found of writing down explicitly the linear functions,
whose existence only was proved by Professor Fisher in Metron.

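2. We first calculate $\Psi_{\min}$.

$\partial \Psi / \partial b_l = 0$, for $l = 1, 2, \cdots, s$, gives the normal equations

(1) $$\sum_i x_{il} y_i = \sum_i \sum_k x_{il} x_{ik} b_k,$$

which can be written

(2) $$\sum_i x_{il} y_i = \sum_k X_{lk} b_k$$

with

$$X_{lk} = \sum_{i=1}^{n} x_{il} x_{ik}.$$

It follows from (1) that

(A) $$\Psi_{\min} = \sum_{i=1}^{n} y_i^2 - \sum_{i=1}^{n} \sum_{l=1}^{s} \sum_{k=1}^{s} x_{il} x_{ik} b_l b_k = \sum_i y_i^2 - \sum_l \sum_k X_{lk} b_l b_k,$$

where the b are solutions of (1).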
3. A second expression for $\Psi_{\min}$ can be found as follows:
Introducing

$$c_i = \sum_{k=1}^{s} x_{ik} b_k$$

we obtain from (1)

(3) $$\sum_i x_{il} c_i = \sum_i x_{il} y_i, \qquad (l = 1, 2, \cdots s).$$

Now if $z_{iu}$, $(u = s + 1, \cdots n)$, are any n − s independent solutions of

$$\sum_{i=1}^{n} x_{il} z_{iu} = 0, \qquad (l = 1, 2, \cdots, s),$$

then the $c_i$ satisfy also

(4) $$\sum_{i=1}^{n} z_{iu} c_i = 0, \qquad (u = s + 1, \cdots n).$$


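Let such a set of $z_{iu}$ be chosen. Then (3) will be solved by

(5) $$c_i = y_i - \sum_{v=s+1}^{n} \lambda_v z_{iv}$$

with $\lambda_v$ as indefinite factors and these $c_i$ satisfy (4), if

$$\sum_{i=1}^{n} z_{iu} y_i = \sum_{i=1}^{n} \sum_{v=s+1}^{n} z_{iu} z_{iv} \lambda_v, \qquad (u = s + 1, \cdots n),$$

or

(6) $$\sum_{i=1}^{n} z_{iu} y_i = \sum_{v=s+1}^{n} Z_{uv} \lambda_v$$

with

$$Z_{uv} = \sum_{i=1}^{n} z_{iu} z_{iv}.$$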
Because of (2) the equation (A) can be transformed into

$$\Psi_{\min} = \sum_i y_i^2 - \sum_i \sum_l x_{il} y_i b_l = \sum_i y_i^2 - \sum_i y_i c_i = \sum_i \sum_v \lambda_v z_{iv} y_i,$$

which is, because of (6),

(B) $$\Psi_{\min} = \sum_{u=s+1}^{n} \sum_{v=s+1}^{n} Z_{uv} \lambda_u \lambda_v,$$

where the $\lambda$ are solutions of (6).
The comparison of (A) and (B) gives

$$\sum_{i=1}^{n} y_i^2 = \sum_{l=1}^{s} \sum_{k=1}^{s} X_{lk} b_l b_k + \sum_{u=s+1}^{n} \sum_{v=s+1}^{n} Z_{uv} \lambda_u \lambda_v,$$

where the first form on the r.h.s. shows the reduction of $\sum_{i=1}^{n} y_i^2$ by the method of
least squares and the second form constitutes the remainder.
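The decomposition into reduction and remainder can be verified exactly on a small example. The sketch below is illustrative only: the data, the helper names and the small Gauss-Jordan solver are assumptions of the example, and the $z_{iu}$ are taken orthogonal to the single column of x as the construction requires.

```python
from fractions import Fraction


def dot(u, v):
    """Exact inner product of two integer or rational vectors."""
    return sum(Fraction(p) * q for p, q in zip(u, v))


def gram(columns):
    """Matrix of inner products, i.e. X_lk or Z_uv in the notation of the text."""
    return [[dot(u, v) for v in columns] for u in columns]


def solve(matrix, rhs):
    """Gauss-Jordan elimination over Fractions; assumes a nonsingular matrix."""
    size = len(matrix)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def quadratic_form(matrix, vec):
    size = len(vec)
    return sum(matrix[i][j] * vec[i] * vec[j]
               for i in range(size) for j in range(size))


def decompose(x_cols, z_cols, y):
    """Return (reduction, remainder, psi_min) as in (A), (B) and the definition."""
    X, Z = gram(x_cols), gram(z_cols)
    b = solve(X, [dot(col, y) for col in x_cols])      # equations (2)
    lam = solve(Z, [dot(col, y) for col in z_cols])    # equations (6)
    fitted = [sum(col[i] * bk for col, bk in zip(x_cols, b)) for i in range(len(y))]
    psi_min = sum((yi - ci) ** 2 for yi, ci in zip(y, fitted))
    return quadratic_form(X, b), quadratic_form(Z, lam), psi_min


# Hypothetical data with n = 3, s = 1; each z column is orthogonal to x.
x_cols = [[2, 1, 3]]
z_cols = [[-1, 2, 0], [-3, 0, 2]]
y = [1, 4, 2]
reduction, remainder, psi_min = decompose(x_cols, z_cols, y)
print(reduction + remainder == dot(y, y), remainder == psi_min)
```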


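4. These two forms must now be expressed in terms of the $y_i$.
We introduce the notations

$$X^{(1)} = X_{11}, \quad X^{(2)} = \begin{vmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{vmatrix}, \quad \cdots, \quad X^{(s)} = \begin{vmatrix} X_{11} & \cdots & X_{1s} \\ \cdots & & \cdots \\ X_{s1} & \cdots & X_{ss} \end{vmatrix}$$

and

$$Z^{(s+1)} = Z_{s+1,s+1}, \quad Z^{(s+2)} = \begin{vmatrix} Z_{s+1,s+1} & Z_{s+1,s+2} \\ Z_{s+2,s+1} & Z_{s+2,s+2} \end{vmatrix}, \quad \text{etc.}$$

It is well known (and can easily be verified) that

$$\sum_{l=1}^{s} \sum_{k=1}^{s} X_{lk} b_l b_k = \frac{1}{X^{(1)}} (X_{11} b_1 + \cdots + X_{1s} b_s)^2 + \frac{1}{X^{(1)} X^{(2)}} \left( \begin{vmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{vmatrix} b_2 + \cdots + \begin{vmatrix} X_{11} & X_{1s} \\ X_{21} & X_{2s} \end{vmatrix} b_s \right)^2 + \cdots + \frac{1}{X^{(s-1)} X^{(s)}} \bigl( X^{(s)} b_s \bigr)^2,$$

which may be written

$$\frac{1}{X^{(1)}} \Bigl( \sum_{k=1}^{s} X_{1k} b_k \Bigr)^2 + \frac{1}{X^{(1)} X^{(2)}} \begin{vmatrix} X_{11} & \sum_k X_{1k} b_k \\ X_{21} & \sum_k X_{2k} b_k \end{vmatrix}^2 + \cdots + \frac{1}{X^{(s-1)} X^{(s)}} \begin{vmatrix} X_{11} & X_{12} & \cdots & \sum_k X_{1k} b_k \\ \cdots & \cdots & & \cdots \\ X_{s1} & X_{s2} & \cdots & \sum_k X_{sk} b_k \end{vmatrix}^2.$$

Using (2), this can be expressed in terms of the $y_i$ instead of $b_i$ as follows:

(7) $$\frac{1}{X^{(1)}} \Bigl( \sum_{i=1}^{n} x_{i1} y_i \Bigr)^2 + \frac{1}{X^{(1)} X^{(2)}} \begin{vmatrix} X_{11} & \sum_i x_{i1} y_i \\ X_{21} & \sum_i x_{i2} y_i \end{vmatrix}^2 + \cdots + \frac{1}{X^{(s-1)} X^{(s)}} \begin{vmatrix} X_{11} & X_{12} & \cdots & \sum_i x_{i1} y_i \\ \cdots & \cdots & & \cdots \\ X_{s1} & X_{s2} & \cdots & \sum_i x_{is} y_i \end{vmatrix}^2.$$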
Similarly by (6) the second form can be transformed into

(8) $$\Psi_{\min} = \frac{1}{Z^{(s+1)}} \Bigl( \sum_{i=1}^{n} z_{i,s+1} y_i \Bigr)^2 + \cdots + \frac{1}{Z^{(n-1)} Z^{(n)}} \begin{vmatrix} Z_{s+1,s+1} & Z_{s+1,s+2} & \cdots & \sum_i z_{i,s+1} y_i \\ \cdots & \cdots & & \cdots \\ Z_{n,s+1} & Z_{n,s+2} & \cdots & \sum_i z_{in} y_i \end{vmatrix}^2.$$

The rank of $(x_{ik})$ is s, so that the order of the suffixes can always be chosen
so as to make the above denominators different from zero.

Thus both the reduction and the remainder have been expressed by sums of
squares, whose numbers correspond to the "degrees of freedom" s and n − s
respectively.


5. It remains to be shown that the linear functions of the $y_i$ appearing in each
form are mutually orthogonal and that in every one of them the sums of the
squares of the coefficients are unity.

Now if we call the n linear forms which occur above $\sum_{j=1}^{n} a_{ij} y_j$, $(i = 1, 2, \cdots, n)$,
then our proof implies that

$$\sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} \Bigl[ \sum_{j=1}^{n} a_{ij} y_j \Bigr]^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} a_{ij} a_{ik} y_j y_k.$$

This is an identity for any $y_i$, hence we must have

$$\sum_{i=1}^{n} a_{ij} a_{ik} = 1 \ \text{ if } j = k, \quad \text{and} \quad = 0 \ \text{ if } j \ne k.$$

We have thus shown that the matrix $(a_{ij})$ is orthogonal and it follows that

$$\sum_{i=1}^{n} a_{ji} a_{ki} = 1 \ \text{ if } j = k \quad \text{and} \quad 0 \ \text{ if } j \ne k.$$


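6. In practical applications the $x_{ik}$ will be given and if the expression (7) or
(8) is to be written down we must first solve the set of equations

$$\sum_{i=1}^{n} x_{il} z_{iu} = 0, \qquad (l = 1, 2, \cdots, s).$$

We may assume that

$$\begin{vmatrix} x_{11} & \cdots & x_{s1} \\ \cdots & & \cdots \\ x_{1s} & \cdots & x_{ss} \end{vmatrix} \ne 0.$$

There exist, of course, an infinity of solutions. A very simple one can be found
if the matrix $(x_{ik})$ is completed into a square matrix by adding 1 in the diagonal
places and 0 elsewhere. We obtain

$$\begin{vmatrix} x_{11} & \cdots & x_{s1} & x_{s+1,1} & \cdots & x_{n1} \\ \cdots & & \cdots & \cdots & & \cdots \\ x_{1s} & \cdots & x_{ss} & x_{s+1,s} & \cdots & x_{ns} \\ 0 & \cdots & 0 & 1 & \cdots & 0 \\ \cdots & & \cdots & \cdots & & \cdots \\ 0 & \cdots & 0 & 0 & \cdots & 1 \end{vmatrix} \ne 0.$$

The minors of the terms of any of the (s + 1)th, $\cdots$, nth line give one of n − s
independent sets of solutions for the $z_{iu}$.
If, e.g. s = 1, then the $z_{iu}$ are

$$\begin{matrix} -x_{21} & x_{11} & 0 & 0 & \cdots \\ -x_{31} & 0 & x_{11} & 0 & \cdots \\ -x_{41} & 0 & 0 & x_{11} & \cdots \end{matrix}$$

etc.
and the Z are

$$\begin{matrix} x_{11}^2 + x_{21}^2, & x_{21} x_{31}, & x_{21} x_{41} \\ x_{21} x_{31}, & x_{11}^2 + x_{31}^2, & x_{31} x_{41} \\ x_{21} x_{41}, & x_{31} x_{41}, & x_{11}^2 + x_{41}^2 \end{matrix}$$

etc.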
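Hence, for s = 1, n = 2,

$$\Psi_{\min} = \sum_{i=1}^{2} y_i^2 - \frac{1}{x_{11}^2 + x_{21}^2} (x_{11} y_1 + x_{21} y_2)^2 = \frac{1}{x_{11}^2 + x_{21}^2} (-x_{21} y_1 + x_{11} y_2)^2$$

and for s = 1, n = 3

$$\Psi_{\min} = \sum_{i=1}^{3} y_i^2 - \frac{1}{x_{11}^2 + x_{21}^2 + x_{31}^2} (x_{11} y_1 + x_{21} y_2 + x_{31} y_3)^2$$

$$= \frac{1}{x_{11}^2 + x_{21}^2} (-x_{21} y_1 + x_{11} y_2)^2 + \frac{1}{(x_{11}^2 + x_{21}^2) \begin{vmatrix} x_{11}^2 + x_{21}^2 & x_{21} x_{31} \\ x_{21} x_{31} & x_{11}^2 + x_{31}^2 \end{vmatrix}} \begin{vmatrix} x_{11}^2 + x_{21}^2 & -x_{21} y_1 + x_{11} y_2 \\ x_{21} x_{31} & -x_{31} y_1 + x_{11} y_3 \end{vmatrix}^2.$$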
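If, however, s = 2, n = 3, then easy calculations lead to

$$\sum_{i=1}^{3} y_i^2 = \frac{(x_{11} y_1 + x_{21} y_2 + x_{31} y_3)^2}{x_{11}^2 + x_{21}^2 + x_{31}^2}$$

$$+ \frac{1}{(x_{11}^2 + x_{21}^2 + x_{31}^2) \begin{vmatrix} x_{11}^2 + x_{21}^2 + x_{31}^2 & x_{11} x_{12} + x_{21} x_{22} + x_{31} x_{32} \\ x_{12} x_{11} + x_{22} x_{21} + x_{32} x_{31} & x_{12}^2 + x_{22}^2 + x_{32}^2 \end{vmatrix}} \begin{vmatrix} x_{11}^2 + x_{21}^2 + x_{31}^2 & x_{11} y_1 + x_{21} y_2 + x_{31} y_3 \\ x_{11} x_{12} + x_{21} x_{22} + x_{31} x_{32} & x_{12} y_1 + x_{22} y_2 + x_{32} y_3 \end{vmatrix}^2$$

$$+ \frac{\left( \begin{vmatrix} x_{21} & x_{31} \\ x_{22} & x_{32} \end{vmatrix} y_1 + \begin{vmatrix} x_{31} & x_{11} \\ x_{32} & x_{12} \end{vmatrix} y_2 + \begin{vmatrix} x_{11} & x_{21} \\ x_{12} & x_{22} \end{vmatrix} y_3 \right)^2}{\begin{vmatrix} x_{21} & x_{31} \\ x_{22} & x_{32} \end{vmatrix}^2 + \begin{vmatrix} x_{31} & x_{11} \\ x_{32} & x_{12} \end{vmatrix}^2 + \begin{vmatrix} x_{11} & x_{21} \\ x_{12} & x_{22} \end{vmatrix}^2}.$$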
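As a specialized case consider s = 1, and $x_{11} = x_{21} = \cdots = x_{n1} = 1$. Then
the Z are

$$\begin{matrix} 2 & 1 & 1 & 1 & \cdots \\ 1 & 2 & 1 & 1 & \cdots \\ 1 & 1 & 2 & 1 & \cdots \\ 1 & 1 & 1 & 2 & \cdots \end{matrix}$$

and

$$\Psi_{\min} = \sum_{i=1}^{n} y_i^2 - \frac{1}{n} \Bigl( \sum_{i=1}^{n} y_i \Bigr)^2 = \sum_{i=1}^{n} \Bigl( y_i - \frac{1}{n} \sum_{j=1}^{n} y_j \Bigr)^2.$$

The sum of squares into which $\Psi_{\min}$ can be transformed is then found to be

$$\frac{1}{1 \cdot 2} (-y_1 + y_2)^2 + \frac{1}{2 \cdot 3} (-y_1 - y_2 + 2y_3)^2 + \frac{1}{3 \cdot 4} (-y_1 - y_2 - y_3 + 3y_4)^2 + \cdots,^1$$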
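The specialized decomposition is easy to confirm numerically. The sketch below is an illustration with hypothetical function names: it forms the successive contrasts $k\,y_{k+1} - (y_1 + \cdots + y_k)$, each squared and divided by $k(k + 1)$, and compares their sum with $\sum y_i^2 - (\sum y_i)^2 / n$ in exact arithmetic.

```python
from fractions import Fraction


def residual_sum_of_squares(y):
    """Sum of squares about the mean: sum(y_i^2) - (sum y_i)^2 / n."""
    n = len(y)
    return sum(Fraction(v) ** 2 for v in y) - Fraction(sum(y)) ** 2 / n


def contrast_squares(y):
    """Terms (k*y_{k+1} - y_1 - ... - y_k)^2 / (k(k+1)) for k = 1, ..., n - 1."""
    terms = []
    for k in range(1, len(y)):
        contrast = k * y[k] - sum(y[:k])
        terms.append(Fraction(contrast ** 2, k * (k + 1)))
    return terms


sample = [3, -1, 4, 1, 5]  # hypothetical integer observations
print(sum(contrast_squares(sample)) == residual_sum_of_squares(sample))
```

There are n − 1 terms, in agreement with the n − s degrees of freedom of the remainder for s = 1.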


1 This is the result contained in a paper by J. O. Irwin, “Independence of the constit- 
uent items in the analysis of variance” Suppl. Roy. Stat. Soc. Jour. Vol. 1 (1934). 

















NOTES 


This section is devoted to brief research and expository articles, notes on
methodology and other short items.


ON THE ANALYSIS OF A CERTAIN SIX-BY-SIX FOUR-GROUP 
LATTICE DESIGN USING THE RECOVERY OF 
INTER-BLOCK INFORMATION 


By BOYD HARSHBARGER¹


Virginia Agricultural Experiment Station 


1. Introduction. A detailed description for a six-by-six four-group lattice 
design is given in a recent article [1] by the author, and the analysis is developed 
which uses only the intra-block information to correct the varieties for the block 
effects. Here is developed the analysis that makes use of both the intra- and the 
inter-block information. 

Referring to Group X on page 307, [1], since block (1) contains varieties 1 to 6, 
and block (2) contains varieties 7 to 12, the difference between the means of 
these two blocks is also an estimate of the difference between the first six varieties 
and the second six varieties. The information obtained from such inter-block 
comparisons was ignored in the previous analysis. In attempting to use this 
information, the chief difficulty is to decide how estimates derived from the 
comparison of block totals shall be combined with the previous estimates. 
Since each block consists of six plots, comparisons between block totals may be 
expected to have a higher error variance than the within-block comparisons, 
just as in split-plot designs the main block comparisons usually have a higher 
error than the sub-plot comparisons. The problem is, therefore, to estimate 
the relative error variances of the inter- and intra-block comparisons, and then 
to combine the two types of estimates to the best advantage. 

2. Calculations of the adjusted varietal totals. In addition to the equations 
(7), [1], which contain all the intra-block information, we now have the additional 
set of equations, 

$B_j = 6\mu$ + (sum varietal constants in this block) + $e_j$, which are estimated
by

$$B_j = 6m + \Sigma v_i + E_j.$$


In these equations and all the following equations, the double prime symbol
(″) used in [1] is omitted, but the statistics have the same meaning as in equations
(7), [1], except that in this paper they are adjusted by both inter- and intra-block
information.





1 The author wishes to express his appreciation to W. G. Cochran of Iowa State College, 
who advised in the preparation of this analysis. 
The general problem is to minimize the function,





F = WS(yi; — m — v; — b;)? + ¥ S(B; — 6m — 3v,,)° 
36 u k 1 
subject to the restriction >> v; = 0 and >» >: bi: = 0, and where W = —and 
7=1 e==z i=] o~ 
: l 
W’ =-3. 
Tb 
Following the method given in [1] the typical block equations for bx - ++ bg is 
I W | W : 
ba = = ——{,, (4Ba — Tan) = = == WC; 
‘= éaw + Ww Oo = 63w + Ww’ 
and for b., --- bus is 
] ] . . ‘ 
Bag 8 Eee — ee . ~~ ((25W~ + 22 WW’ W)C, 
wae + Woew + ws WW + ill la 


W — W’ 
W+ WwW’ 

It can be seen that for W’ = 0, ba and b,; are the intra-block values given in 
[1] and for W’ = W they are the randomized block values. 

A typical adjustment varietal total then becomes 

‘ W — W’ 
dv, + 4m = | ; == = W (bz1 + by +> ba + bu). 

3. Estimation of W and W’. Following the method presented by Cochran [6] 

and Yates [3], the error of a block total may be written as 


E; = ei + C i2 + See + Cié + Gb; 


+ (W = W’)*(Cis + C.s)] + (Cuz + Cus + Ca). 


where 
$V(e) = \sigma^2$ and $V(b) = \sigma_b^2$.


Hence $V(E_j) = 6\sigma^2 + 36\sigma_b^2$ and component (a) is thus an estimate of $\sigma^2 + 6\sigma_b^2$.
One finds from evaluating the expected value of (15), [1| corrected for replicates, 


; . eee — ° 
B( 0 _ ze that the expected value of component (b) is 7 + 2-605. 


In the analysis of variance if components (a) and (b) are pooled, one obtains the 
block variance B as an estimate of o + 1.605 . Since the intra-block variance 
is an estimate of o” the estimates of the true variance between blocks, o° + 60; ; 
. SB-E 1 

. 


_ 
4. Standard error of adjusted varietal means. The standard error of the 
difference between the adjusted means of two varieties which appear together in 
the same blocks in groups Z or U, is 


l SW 
aw? 9 tay) 
obtained by the method outlined by Cochran. Similarly, for the case in which
the varieties are together in the same block in groups Z or U. 

When an attempt is made to express the difference between these two adjusted 
varieties which appear together in the same block in groups X or Y in terms of 
the levels of the main effects and interactions, the interactions are no longer 
unconfounded and the method employed above breaks down. 

If one is willing to assume that the formula for the variance of the difference 
between two adjusted varietal means for varieties which appear together in the 
same block in the groups X or Y is of the form o4 i (4 3W +W ) the 
constants may be determined by the values already known, [1]. This form can 
be shown to be that for a quadruple lattice. 

a 1 BW _, 

Che formula aw (1 + a + - must reduce to the value for intra-block 
analysis [1] when W’ = 0, and when W = W’ to the value for complete random- 
ized blocks. When these conditions are imposed, the formula becomes 


ads (16 ed > 
144 3W + W’ 
This value is slightly larger than the value obtained when the adjusted varieties
appear together in the same block in groups Z or U, as should be the case. This
gives us a lower limit. One can arrive at the upper limit in the following manner:
suppose the variance (intra); obtained in the intra-block analysis for the difference 
between two varietal means such as 7; and v is greater than that for varietal 
means vs and 2; (intra). , then it follows that: 


(intra); 


(inter + intra), S (inter + intra). X 
(intra)e’ 





Using this relation, the upper limit for two varieties together in the same block 
in groups X or Y is 


(34 _12W = 
= 3W + W’ 
which gives a value slightly greater than the formula derived, as it should if it 
is to be the upper limit. In a similar manner one gets the variance for the differ- 
ence between varietal means not appearing together in the same block. 

5. Efficiency of the design to the randomized complete blocks. By the 
method outlined by Cochran [6] the efficiency can be shown to be measured by 
the ratio of 

a e. 

ea to 4 (average error variance of the difference between two plots). 

It will be noted, by using the above formula, that the gain in efficiency for 
the numerical problem given in [1] is 1.003, which for our purpose here is zero. 
This, in general, will not be the case, for on most soils there is a block difference.
In this particular test the ground used had been previously filled in with well 
mixed soil. The efficiency for the analysis given in [1] relative to the randomized 
complete blocks was less than 1.00. 

This paper and the previous one show what a long tedious procedure is neces- 
sary to analyze the data, when the design does not follow the rules for the 
construction of the lattice, triple lattice, etc. The complexity of these methods 
stresses the importance, to those designing experiments, of not deviating from 
the established design if the most information is to be secured from the data with 
simple calculations. 
FURTHER REMARKS ON LINKAGE THEORY IN
MENDELIAN HEREDITY


By HILDA GEIRINGER


Wheaton College

In the following an explicit formula for the distribution of genotypes in case of 
three Mendelian characters will be given [formula (5)]. The complete discussion 
of the case m = 3 suggests a supplement (as stated in the last paragraph of this 
paper) to the general limit theorem dealing with m characters. 

In an earlier paper¹ recurrence formulae have been derived which furnish the
distribution of genotypes in the nth generation if the distribution in the (n − 1)th
generation and the "linkage distribution" (l.d.) are known. It was also
shown how to "integrate" this system of difference equations so as to determine
the distribution in the nth generation directly from that in the 0th generation.
This last method, though straightforward, requires however in each particular
case quite a few operations.

In case m, the number of Mendelian characters, equals two, an explicit
formula for the problem in question had been known. Denote by $p(x_1, x_2)$,


¹ HILDA GEIRINGER, Annals of Math. Stat. Vol. 15 (1944), pp. 25-57. The notation
in the present Note will be the same as in this paper.
$(x_1, x_2 = 1, 2, \cdots k)$, the "distribution of transmitted genes" in the original, 0th,
generation, by $p^{(n)}(x_1, x_2)$ that in the nth generation and by c the "crossover
probability" (c.p.). Then the simple formula holds:²

(1) $$p^{(n)}(x_1, x_2) = (1 - c)^n p(x_1, x_2) + [1 - (1 - c)^n]\, p_1(x_1) p_2(x_2).$$

This may also be written:

(1') $$p^{(n)}(x_1, x_2) = p_1(x_1) p_2(x_2) + (1 - c)^n [p(x_1, x_2) - p_1(x_1) p_2(x_2)],$$

where $p_i(x_i)$ are the marginal distributions derived from $p(x_1, x_2)$. (1') shows
that, in case of independence of the original distribution, $p(x_1, x_2) = p_1(x_1) p_2(x_2)$,
we have $p^{(n)}(x_1, x_2) = p(x_1, x_2)$ for every n. The same is true for arbitrary $p(x_1, x_2)$
if c = 0. Otherwise, if c > 0 the second term to the right in (1') tends towards
zero as $n \to \infty$ and the well known limit theorem results.
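Formula (1) can be checked by iterating generation by generation. The sketch below assumes, as the one-generation rule, the case n = 1 of (1), namely $p' = (1 - c)\,p + c\,p_1 p_2$; the recurrence of the earlier paper is not reproduced in this Note, so this rule, the sample distribution and the function names are assumptions of the illustration.

```python
from fractions import Fraction


def marginals(p):
    """Row and column sums of a two-way table of gene-pair probabilities."""
    p1 = [sum(row) for row in p]
    p2 = [sum(row[j] for row in p) for j in range(len(p[0]))]
    return p1, p2


def next_generation(p, c):
    """Assumed one-generation rule: formula (1) with n = 1."""
    p1, p2 = marginals(p)
    return [[(1 - c) * p[i][j] + c * p1[i] * p2[j] for j in range(len(p2))]
            for i in range(len(p1))]


def closed_form(p, c, n):
    """Formula (1): (1 - c)^n p + [1 - (1 - c)^n] p1 p2."""
    p1, p2 = marginals(p)
    decay = (1 - c) ** n
    return [[decay * p[i][j] + (1 - decay) * p1[i] * p2[j] for j in range(len(p2))]
            for i in range(len(p1))]


# Hypothetical original distribution for k = 2 and crossover probability 1/5.
p0 = [[Fraction(1, 2), Fraction(1, 8)], [Fraction(1, 8), Fraction(1, 4)]]
c = Fraction(1, 5)
current = p0
for _ in range(6):
    current = next_generation(current, c)
print(current == closed_form(p0, c, 6))
```

The agreement rests on the marginal distributions being unchanged from one generation to the next, which the one-generation rule preserves.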

In case m = 3, a remarkably elegant explicit formula exists³ which may be
deduced from the author's general theory. In this case the l.d. is completely
equivalent to the three c.p.'s $c_{12}$, $c_{23}$, $c_{13}$. The $c_{ij}$ are probabilities with sum
$\le 2$, and for which the triangular relation

(2) $$c_{ij} + c_{jk} \ge c_{ik}$$

holds. If $l(\epsilon_1, \epsilon_2, \epsilon_3)$ $(\epsilon_i = 0, 1)$ denotes the eight values of the l.d. we have (see
quot. [1], p. 32) $l(000) = l(111)$, $l(100) = l(011)$, $l(010) = l(101)$, $l(001) = l(110)$,
hence three independent values only. We may introduce

(3) $$2l(000) = v(000) = v_0, \quad 2l(100) = v(100) = v_1, \quad 2l(010) = v(010) = v_2, \quad 2l(001) = v(001) = v_3; \qquad v_0 + v_1 + v_2 + v_3 = 1.$$

It follows easily that

(4) $$c_{ij} = v_i + v_j, \qquad (i \ne j;\ i, j = 1, 2, 3).$$

The original distribution $p(x_1, x_2, x_3)$ has marginal distributions $p_{ij}(x_i, x_j)$,
$p_i(x_i)$. These values will be denoted briefly by $p_{123}$, $p_{12}$, $p_{23}$, $p_{13}$, $p_1$, $p_2$, $p_3$
respectively. Writing in an analogous way $p^{(n)}(x_1 x_2 x_3) = p^{(n)}_{123}$ the new formula is
the following:

(5) $$p^{(n)}_{123} = p_1 p_2 p_3 + [(v_0 + v_1)^n - v_0^n](p_1 p_{23} - p_1 p_2 p_3) + [(v_0 + v_2)^n - v_0^n](p_2 p_{13} - p_1 p_2 p_3) + [(v_0 + v_3)^n - v_0^n](p_3 p_{12} - p_1 p_2 p_3) + v_0^n (p_{123} - p_1 p_2 p_3).$$

This useful formula permits to compute readily $p^{(n)}_{123}$ for every n. In terms of the
$c_{ij}$, writing

(6) $$d_{ij} = 1 - c_{ij}, \qquad v_0 = 1 - \tfrac{1}{2}(c_{12} + c_{23} + c_{13}),$$

it reads

(5') $$p^{(n)}_{123} = p_1 p_2 p_3 + (d_{23}^n - v_0^n)(p_1 p_{23} - p_1 p_2 p_3) + \cdots + v_0^n (p_{123} - p_1 p_2 p_3).$$
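Formula (5) can likewise be compared with a generation-by-generation iteration. The sketch below assumes the one-generation rule $p'_{123} = v_0\, p_{123} + v_1\, p_1 p_{23} + v_2\, p_2 p_{13} + v_3\, p_3 p_{12}$, which is simply (5) with n = 1; the starting distribution, the values of $v_0, \cdots, v_3$ and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

STATES = list(product((0, 1), repeat=3))  # two alleles per character, for brevity


def marginal(p, keep):
    """Marginal distribution over the coordinates listed in keep."""
    out = {}
    for x, w in p.items():
        key = tuple(x[i] for i in keep)
        out[key] = out.get(key, 0) + w
    return out


def pieces(p):
    """Products p1 p2 p3, p1 p23, p2 p13, p3 p12 appearing in formula (5)."""
    m1, m2, m3 = (marginal(p, (i,)) for i in range(3))
    m23, m13, m12 = marginal(p, (1, 2)), marginal(p, (0, 2)), marginal(p, (0, 1))
    q = {x: m1[x[:1]] * m2[x[1:2]] * m3[x[2:]] for x in STATES}
    a1 = {x: m1[x[:1]] * m23[x[1:]] for x in STATES}
    a2 = {x: m2[x[1:2]] * m13[(x[0], x[2])] for x in STATES}
    a3 = {x: m3[x[2:]] * m12[x[:2]] for x in STATES}
    return q, a1, a2, a3


def next_generation(p, v):
    """Assumed one-generation rule: formula (5) with n = 1."""
    _, a1, a2, a3 = pieces(p)
    return {x: v[0] * p[x] + v[1] * a1[x] + v[2] * a2[x] + v[3] * a3[x]
            for x in STATES}


def closed_form(p, v, n):
    """Formula (5) evaluated from the original distribution."""
    q, a1, a2, a3 = pieces(p)
    v0n = v[0] ** n
    return {x: q[x]
            + sum(((v[0] + v[i]) ** n - v0n) * (a[x] - q[x])
                  for i, a in ((1, a1), (2, a2), (3, a3)))
            + v0n * (p[x] - q[x])
            for x in STATES}


p0 = {x: Fraction(w, 36) for x, w in zip(STATES, range(1, 9))}
v = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8))
current = p0
for _ in range(5):
    current = next_generation(current, v)
print(current == closed_form(p0, v, 5))
```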


2H. S. JENNINGS, Genetics, Vol. 12 (1917) pp. 97-154. 


3 Professor Felix Bernstein called this author’s attention to the biologically interesting 
case m = 3. 
In these formulae the role of independence of the original distribution is clearly
seen: If $p_{ij} = p_i p_j$ and $p_{123} = p_1 p_2 p_3$ then $p^{(n)}_{123} = p_{123}$ for every n and every l.d.
The same holds for every n and every $p_{123}$ if $v_0 = 1$, which implies that all $c_{ij}$ be
zero. If in (5') all $d_{ij} < 1$, hence all $c_{ij} > 0$ the limit theorem $\lim_{n \to \infty} p^{(n)}_{123} = p_1 p_2 p_3$ results. $c_{ij} > 0$ means that complete linkage between any two genes is
excluded. If, on the other hand, e.g. $v_0 > 0$, $v_1 > 0$, $v_0 + v_1 = d_{23} = 1$, $c_{23} = 0$,
hence $v_0 < 1$, $v_2 = v_3 = 0$ we get $p^{(n)}_{123} \to p_1 p_{23}$. If $c_{13} = c_{23} = 0$ the triangular
relation (2) shows that $c_{12} = 0$ too, a case considered above.

It should be noticed that (5) is, of course, in agreement with the author's
equation (41) in quot. [1]. It only has to be observed (an obvious fact not
mentioned in my earlier paper) that in the former setup the sum of all the $a^{(n)}$
for every fixed n equals one. Thus for m = 3:

(7) $$a^{(n)}_{123} + a^{(n)}_{1,23} + a^{(n)}_{2,13} + a^{(n)}_{3,12} + a^{(n)}_{1,2,3} = 1, \qquad \text{(for every } n\text{)},$$

and

(8) $$a^{(n)}_{123} = v_0^n, \qquad a^{(n)}_{1,23} = (v_0 + v_1)^n - v_0^n = d_{23}^n - v_0^n,$$

$$a^{(n)}_{2,13} = (v_0 + v_2)^n - v_0^n = d_{13}^n - v_0^n,$$

$$a^{(n)}_{3,12} = (v_0 + v_3)^n - v_0^n = d_{12}^n - v_0^n.$$

The preceding complete discussion of the case m = 3 suggests a remark
concerning the general case of m characters. In my earlier paper the influence
on the main limit theorem of certain ways of degeneration of the l.d. had not been
explicitly considered. In the following we shall use the v-distribution which
is a little shorter to write than the l.d. $l(\epsilon_1, \epsilon_2, \cdots \epsilon_m)$. The v-distribution contains
only $2^{m-1}$ values with sum one, defined in a way similar to (3). The main
limit theorem ([1], theorem II, p. 42) states in our present notation that

(9) $$\lim_{n \to \infty} p^{(n)}_{12 \cdots m} = p_1 p_2 \cdots p_m,$$

if "complete linkage" between any group of genes is excluded. That implies
that not only $v_0 = v(0, 0, \cdots 0) = 1$ must be excluded but even $v_{ij \cdots k}(0, \cdots 0) = 1$,
where this last probability denotes a marginal distribution of the v-distribution
of an order $\ge 2$. To assure this it is necessary and sufficient that no $v_{ij}(0, 0) = 1$,
or no $d_{ij} = v_{ij}(0, 0) = 1$, or no $c_{ij} = 0$. Hence (9) holds if and only if no $c_{ij} = 0$.
If this condition is not satisfied the l.d. degenerates in various ways and the limit
theorem is to be modified accordingly. If, in particular, $v_0 = 1$, all $c_{ij} = 0$, and
$p^{(n)}_{12 \cdots m} = p_{12 \cdots m}$ for every n.

Between these two extreme cases ("no $c_{ij} = 0$", "all $c_{ij} = 0$") are the different
possibilities of r < m groups of completely linked characters (see [1] p. 36, iv)).
Consider e.g. m = 7 and $v_{1234}(0000) = 1$, $v_{567}(000) = 1$ (this is realized if
v(0000000) > 0, v(0000111) > 0 with sum of these two numbers equal to one) then
$\lim_{n \to \infty} p^{(n)}_{12 \cdots 7} = p_{1234}\, p_{567}$. Here the four characters 1, 2, 3, 4 act as one character and
$p^{(n)}_{1234} = p_{1234}$ for every n. Also $p^{(n)}_{567} = p_{567}$. Or if, for m = 6, $d_{12} = d_{34} = d_{56} = 1$
(realized if v(000000) > 0, v(110000) > 0, v(001100) > 0, v(000011) > 0, with
the sum of these four values equal to one) then $p^{(n)}_{12 \cdots 6} \to p_{12}\, p_{34}\, p_{56}$. If however
for m = 6 merely $d_{12} = d_{34} = 1$ (realized if, in a notation analogous to (3), $v_0$, $v_5$,
$v_6$, $v_{56}$, $v_{12}$, $v_{34}$, $v_{125}$, $v_{126}$ are the only non-zero values of the l.d.) then $p^{(n)}_{12 \cdots 6} \to p_{12}\, p_{34}\, p_5\, p_6$.

In general, with a proof which consists in a modification of the reasoning (p.
41), of my earlier paper, we may state the following complement to the main
limit theorem (9): If the l.d. is such that r < m disjoint groups $G_1, G_2, \cdots G_r$
of completely linked characters exist, i.e. such that within each group no crossover
takes place, each group containing as many of the m numbers as compatible with the
definition but not less than two, and all groups together containing $s \le m$ of the m
elements, then, as $n \to \infty$, $p^{(n)}_{12 \cdots m}$ converges towards the product of those marginal
distributions (of the original generation) which correspond to these groups multiplied
by the marginal distributions of order one of the remaining free elements which are not
contained in any such group. In a formula:

(10) $$\lim_{n \to \infty} p^{(n)}_{G_1 G_2 \cdots G_r \nu_{s+1} \nu_{s+2} \cdots \nu_m} = p_{G_1} p_{G_2} \cdots p_{G_r}\, p_{\nu_{s+1}} p_{\nu_{s+2}} \cdots p_{\nu_m}.$$


We may also characterize these linked groups of maximum size by stating that
while within each group no crossover takes place there must be at least one c.p. $\ne 0$
among any two such groups and at least one among any group and any free
element. It may however be noted that if there is one c.p. > 0 among two
groups of complete linkage (or among a group and a free element) then all c.p.'s
among these two groups are different from zero. In fact, it follows by repeated
use of the triangular relation (2) that if one c.p. among two disjoint groups of
complete linkage is zero, all of them are zero. If, e.g., (1, 2,3) and (5, §, 8) are two 
groups of complete linkage, i.e. v12;(000) = 1 and v5s3(000) = 1 and if besides 
C5 = 0, then v2356s(000000) = 1 and these six elements form a group of complete 
linkage. 

It may be noticed that the above statement of the generalized limit theorem
becomes simpler and more elegant by counting "free elements" as groups. It
might then run as follows: If $G_1, G_2, \cdots G_t$ $(t \le m)$ are the maximal groups of
completely linked characters, then, under the hypotheses of the earlier paper, the gene
distribution in successive generations approaches a limit in which the original (marginal)
probabilities within each group $G_i$ are preserved and genes and sets of genes
from different groups are independently distributed.





ON THE DEFINITION OF DISTANCE IN THE THEORY OF THE GENE 


By HILDA GEIRINGER
Wheaton College 


In several letters to this author Dr. I. M. H. Etherington of the University of 
Edinburgh has raised questions concerning the author’s definition of ‘‘distance”’ 
proposed in Section 10 of her paper on Mendelian heredity,’ comparing it with 





1 Annals of Math. Stat., Vol. 15 (1944), pp. 25-57. 
the definition implicit in Professor J. B. S. Haldane's earlier treatment.² The
main content of the author’s paper consists of some general limit theorems and 
the integration of a certain system of difference equations. The distance defini- 
tion is a by-product subject to discussion. 

“Distance” $d_{ij}$ between two genes $i$ and $j$ is defined by the author as the
mathematical expectation of the number of crossovers in the interval $(i, j)$ with
respect to the “linkage distribution” (l.d.). This basic concept is introduced
as follows (page 32): If $S$ is the set of numbers $1, 2, \ldots, m$ ($m$ being the number
of Mendelian characters), $A$ any subset of $S$ and $A' = S - A$, we denote by
$l(A)$ the probability that an individual with “maternal” genes $x_1, \ldots, x_m$
and paternal genes $y_1, \ldots, y_m$ transmits the paternal genes belonging to $A$ and the
maternal genes belonging to $A'$. These $2^m$ probabilities constitute the l.d.
From these definitions the equality (G. (53'))

(1)    $d_{ij} = c_{i,i+1} + c_{i+1,i+2} + \cdots + c_{j-1,j}$    $(i < j)$

is derived, where $c_{ij}$ is the probability of a “crossover” (c.p.) in $(i, j)$. This
distance has the required additivity (G. (54)):

(2)    $d_{ij} + d_{jk} = d_{ik}$,    $(i < j < k)$.
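
As a minimal illustration of formulas (1) and (2), the following Python sketch computes $d_{ij}$ as the sum of the adjacent c.p.'s and checks additivity. The function name `distance` and the numerical c.p.'s are hypothetical and are not taken from the author's paper.

```python
def distance(adjacent_cp, i, j):
    """d_ij of formula (1): sum of the adjacent c.p.'s c_{k,k+1} for i <= k < j.

    Genes are numbered 1..m; adjacent_cp[k - 1] holds c_{k,k+1}.
    """
    return sum(adjacent_cp[i - 1:j - 1])


# Hypothetical adjacent c.p.'s for m = 5 genes: c_12, c_23, c_34, c_45.
adjacent_cp = [0.10, 0.05, 0.20, 0.15]

d_13 = distance(adjacent_cp, 1, 3)
d_35 = distance(adjacent_cp, 3, 5)
d_15 = distance(adjacent_cp, 1, 5)

# Additivity (2): d_13 + d_35 should equal d_15 up to rounding.
print(d_13, d_35, d_15)
```

Since $d_{ij}$ is built from the adjacent c.p.'s of the genes actually listed, dropping a gene from the list changes the sum, which is the dependence on intermediate genes discussed in the text.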


Etherington points out that the term “distance” has an established currency
in genetics, being the basis on which chromosome maps are constructed, and
that there is a standard method of calculating it in accordance with which (1)
is an “approximation valid only when the adjacent c.p.'s are small.” Moreover
“the biological uniqueness has been lost for the value of $d_{ij}$ now depends on the
particular set of intermediate genes which we happen to be considering. If any
of them are omitted from consideration then the inequality (G. (13))

(3)    $c_{ij} + c_{jk} \ge c_{ik}$

shows that in general $d_{ij}$ is diminished while if new genes are taken into consideration
$d_{ij}$ may increase.” “In order that $d_{ij}$ should not depend on a particular
choice of intermediate genes the word ‘crossover’ in the definition given would
have to be interpreted as ‘chiasma’ instead of ‘odd number of chiasmata’; and
then $d_{ij}$ cannot be evaluated in terms of the l.d. alone without further assumptions
regarding the interference of crossovers.”

The point of view adopted in the author's paper was to regard the l.d. as the
basis from which everything else has to be inferred. The number $m$ of Mendelian
characters is considered constant and the distance, being a mathematical
expectation with respect to the l.d., necessarily depends on it. In this conception
distance is not a geometric property which can be measured for any two genes
independently but rather a system of $m(m - 1)/2$ consistent numbers associated
to the $m$ genes. There is no choice regarding the intermediate genes to be taken
into consideration; all known genes are to be considered, i.e. one has to use the
available relevant information in order to determine the l.d., the c.p.'s and the
distances. If the information is incomplete the results will be provisional and
subject to change; if it is satisfactory the same will be true for the distances.
Thus it is nothing but natural that $d_{ij}$ is changed if some genes are omitted from
consideration, or if new genes are discovered. In this set-up “crossover,”
defined by means of the marginal distributions of second order of the l.d., means
a transition from the paternal to the maternal set or vice versa. (Expressed
in terms of the chiasma-hypothesis this means “odd number of chiasmata
between adjacent genes.”) Additional assumptions “regarding the interference
of crossovers” are neither necessary nor admissible. All this is contained in the
l.d.

2 Quotation [4a] in the author's paper. References to these papers will be distinguished
by the initials H and G.

Haldane's approach as translated by Etherington into the author's notation
is as follows. “The genes are considered to be distributed continuously along a
chromosome. Thus this approach unlike G.'s is not based on the l.d. of a
finite set of genes. We must think of one suffix, $i$, as referring to a gene at a
fixed locus on the chromosome, the others to variable loci, so that the c.p.'s
are variable. For any three genes $i$, $j$, $k$ a quantity $p$ is defined by the equation

(4)    $c_{ik} = c_{ij} + c_{jk} - p\,c_{ij}c_{jk}$,    $(i < j < k)$.

Biological considerations show that $p$ is a number between 0 and 2 (small when
$c_{ij}$ and $c_{jk}$ are both small, increasing, on the whole, with $c_{ij} + c_{jk}$). The distance
$D_{ij}$ is defined by the statement

(5)    $D_{kj}/c_{kj} \to 1$ as $k$ approaches $j$ $(c_{kj} \to 0)$,

together with the additive property, and from this with (4) Haldane's general
distance expression is derived:

(6)    $D_{ij} = \int_0^{c_{ij}} \frac{dc_{ij}}{1 - p_0 c_{ij}}$.

Here $p_0 = p_0(c_{ij})$ denotes the limiting form of $p$ when $k$ approaches $j$, and represents
biologically a property of the chromosome segment $(i, j)$, a measure of
interference. Any suitable specification of this function $p_0(c_{ij})$ would constitute
a mathematical ‘model’ of the chromosome. If $p$ were constant we should
have $p_0 = p$ and

(7)    $D_{ij} = -\frac{1}{p} \log (1 - p\,c_{ij})$.

Both Haldane and Geiringer considered the special cases $p = 2$ (no interference)
and $p = 0$ (complete interference) for which respectively

(7')    $D_{ij} = -\frac{1}{2} \log (1 - 2c_{ij})$,

(7'')    $D_{ij} = c_{ij} = d_{ij}$.

Since $p$ is always between 0 and 2 Haldane concludes that the true value of $D_{ij}$
is between (7') and (7''), and he gives reasons for saying that (7') is nearly correct
for genes ‘far apart,’ (7'') for genes ‘close together.’ ”
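
The relation between (4) and (7) can be tried numerically. The sketch below is only an illustration: the names `combine` and `haldane_distance` and the c.p. values are hypothetical. It composes two c.p.'s by (4) for constant $p$ and compares $D_{ij} + D_{jk}$ with $D_{ik}$ as given by (7), treating $p = 0$ as the limiting case (7'').

```python
import math


def combine(c_ij, c_jk, p):
    """Relation (4): c_ik = c_ij + c_jk - p * c_ij * c_jk."""
    return c_ij + c_jk - p * c_ij * c_jk


def haldane_distance(c, p):
    """Formula (7): D = -(1/p) log(1 - p c); for p = 0 the limit is D = c, as in (7'')."""
    if p == 0:
        return c
    return -math.log(1.0 - p * c) / p


c_ij, c_jk = 0.10, 0.20  # hypothetical c.p.'s of two adjoining segments

for p in (0.0, 1.0, 2.0):
    c_ik = combine(c_ij, c_jk, p)
    lhs = haldane_distance(c_ij, p) + haldane_distance(c_jk, p)
    rhs = haldane_distance(c_ik, p)
    print(p, c_ik, lhs, rhs)  # lhs and rhs agree up to rounding
```

For $p = 2$ the formula requires $c < 1/2$, which holds for the values chosen here.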
If the author is right, this seems to be the standard definition accepted in
genetics as mentioned above by Etherington. A few, not exhaustive, comments
may be added. Writing in (6) $t$ for the variable of integration and $p_0 = p_0(t)$
it is seen that the expression

(6)    $D_{ij} = \int_0^{c_{ij}} \frac{dt}{1 - t\,p_0(t)}$

contains the unknown function $p_0(t)$, which is unspecified except for the statement
that it is bounded between 0 and 2. It is immediately seen that with an
arbitrary $p_0(t)$ and without a restriction taking the place of (4) this distance (6)
will not be additive in the sense of (2). By imposing, after a choice of $p_0(t)$,
appropriate restrictions on the $c_{ij}$ additivity may be achieved. For instance in
the particular case $p_0(t) = p$ = const, (2) holds by virtue of (4). For such a set
of restrictions it has then to be proved that the corresponding “model” is “consistent,”
i.e. that the so restricted c.p.'s form a compatible set of marginal
distributions of second order of an $m$-variate distribution, the l.d.

These different points will be exemplified presently by studying the particular
case $p_0(t) = p$, where $p$ is a suitably chosen constant; the parameter $p$ is to be
fitted to the observations under consideration. It may be impossible to reproduce
a set of observations satisfactorily if one parameter only is available. In
fact, Haldane's paper suggests that it is not only the particular case $p$ = const
he has in mind. It seems however that if $D_{ij}$ is given by (6) with a non-constant
$p_0(t)$, complicated and perhaps (biologically) not very meaningful conditions may
have to be introduced in order to assure additivity of the distances and consistency
of the respective model. This author was unable to work out examples
of more general and at the same time appropriate and fairly simple assumptions
for the unknown function $p_0(t)$.

If $p$ = const, then (7) under the restriction (4) furnishes an additive distance
definition because

$-p[D_{ij} + D_{jk}] = \log (1 - p\,c_{ij}) + \log (1 - p\,c_{jk})$
$\qquad = \log (1 - p\,c_{ij} - p\,c_{jk} + p^2 c_{ij}c_{jk}) = \log (1 - p\,c_{ik}) = -p\,D_{ik}$,

because of (4). Let us now investigate whether there is a consistent system of
c.p.'s satisfying (4). Put, as in G. (48), $c_{i,i+1} = p_i$, combine (4) with G. (50) and
write $p = 2\varepsilon$. It follows that (4) is satisfied with $0 \le \varepsilon < 1$, if:

(8)    $p_{ij} = \varepsilon\,p_ip_j$,    $p_{ijk} = \varepsilon^2 p_ip_jp_k$,    $\cdots$

Here $p_{ij}$ is the probability of the simultaneous occurrence of the “events”
numbered $i$ and $j$, etc. For $\varepsilon = 0$ we get “disjoint events” (see G.z) for the
discussion of consistency). Assume now $\varepsilon > 0$. By some considerations,
analogous to those on p. 54 of G., the following necessary and sufficient condition of
consistency follows:

(9)    $\prod_{i=1}^{m-1} (1 - \varepsilon p_i) \ge 1 - \varepsilon$    $(\varepsilon > 0)$.
This restriction (not considered by Haldane or Etherington) is, of course,
relevant. If e.g. $m = 3$, $p_1 = p_2 = 4/5$, then $\varepsilon$ must be $\ge 15/16$; or if $m = 4$,
$p_1 = p_2 = p_3 = 1/2$, $\varepsilon \ge 3 - \sqrt{5}$ results. The restriction required by the “linear
theory” is


(10) mee, (¢ = 1,2,---,m—1). 


Hence this model is consistent under certain restrictions. It is, in contrast
to Etherington's contention, different from iii) G. p. 54. The corresponding
distance definition (7) is different from the author's. The $D_{ij}$ thus defined are
additive, and $D_{ij}$ depends on $c_{ij}$ only and not on the intermediate genes. The
author's definition of distances, $d_{ij}$, is general, additive and seems to the author
to be well adapted to the biological situation; since the definition of $d_{ij}$ is not
related to any particular model it is compatible with any model, which may
contain any desired (consistent) assumptions about “interference,” etc. For
example in G. iv) p. 55, an n-parametric model has been suggested which seems
fairly flexible.
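
The consistency condition for the model with constant $p = 2\varepsilon$, namely that the product of the factors $(1 - \varepsilon p_i)$ over the adjacent c.p.'s $p_i$ be at least $1 - \varepsilon$, lends itself to a direct numerical check. The sketch below is illustrative only; the names `slack` and `is_consistent` are hypothetical, and exact rational arithmetic is used so that boundary cases are not blurred by rounding.

```python
from fractions import Fraction
from math import prod


def slack(p, eps):
    """Left side minus right side of the condition: prod(1 - eps*p_i) - (1 - eps)."""
    return prod(1 - eps * p_i for p_i in p) - (1 - eps)


def is_consistent(p, eps):
    """True when prod(1 - eps*p_i) >= 1 - eps, for eps > 0."""
    return slack(p, eps) >= 0


# m = 3 with p_1 = p_2 = 4/5: eps = 15/16 is exactly on the boundary.
two = [Fraction(4, 5)] * 2
print(is_consistent(two, Fraction(15, 16)), is_consistent(two, Fraction(9, 10)))

# m = 4 with p_1 = p_2 = p_3 = 1/2: the boundary 3 - sqrt(5) lies between 3/4 and 4/5.
three = [Fraction(1, 2)] * 3
print(is_consistent(three, Fraction(3, 4)), is_consistent(three, Fraction(4, 5)))
```

With large adjacent c.p.'s, small values of $\varepsilon$ (strong interference) are thus ruled out, which is the sense in which the restriction is relevant.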

It may however seem more acceptable to the biologist not to use a general
distance definition but to define “distance” merely in relation to some sufficiently
general “model” (such that the distance definition would vary with the model),
instead of accepting an all-over definition as ventured in the author's paper.
The particular model (8) in connection with its related distance definition (7)
might give an example of such an approach.³ ⁴


3 As Etherington remarks, eq. (14') in the author's original paper is not correct. One
can only state that (47) holds. The mistake is however without consequence since no
conclusions are drawn from (14'). The same mistake was pointed out by Professor Kai
Lai Chung.

4 Etherington writes: “I have been kindly allowed to read Professor Geiringer's MS.
and feel that some comments are necessary. 

The standard procedure for calculating the distance between two linked genes is as 
follows. A selection of intermediate genes is taken and the adjacent crossover values 
calculated, giving a provisional estimate of the distance as in Geiringer’s formula (1). 
When further intermediate genes are added to the selection, it is found that the provisional 
distance increases, but there is apparently a maximum value beyond which it cannot be 
increased. This unknown maximum value is the distance, and the geneticist accepts (1) 
as the distance when he is sure that he has observed a sufficient number of intermediate 
genes to give a good enough approximation to the true distance. Thus Geiringer’s formula 
(1) gives the geneticist’s true distance only on the understanding that it includes all genes 
intermediate between $i$ and $j$; but generally speaking the great majority of these genes
may be unobservable in the sense that they have no observably distinct alleles by means 
of which the c.p.’s could be calculated, though from time to time fresh genes may become 
observable by mutation. 

In some cases the above procedure fails because not enough intermediate genes can be 
observed; then Haldane’s analysis is useful. It should be emphasized that his distance is 
additive by definition. (For a geometrical analogy, think of the genes as points closely 
distributed along a curve, chords representing c.p.’s. Haldane’s definition of the distance 
is analogous to defining arc length of the curve as a limiting sum of chords.) In my
transcription of his treatment, I should perhaps have made it clearer that the derived formula
(6) gives only the distance $D_{ij}$ measured from the initially chosen and fixed gene $i$ to an
arbitrary gene $j$. Other distances $D_{jk}$, $(i < j < k)$, are deduced from it by the postulate
of additivity ($D_{jk} = D_{ik} - D_{ij}$). If the origin $i$ is changed, there will be a similar formula
(6), but it should not be assumed that the function $p_0$ is the same. In referring to certain
conditions necessary ‘to assure additivity,’ Geiringer evidently means conditions that the
function $p_0$ may be the same for all origins $i$. These conditions would be interpreted
biologically as asserting uniformity of interference along the chromosome. I agree that there
are further points to be cleared up in this connection. 

If I might sum up the discussion, I would say that the geneticist’s conception of the 
distance between genes is an actual property of the corresponding chromosome segment. 
Geiringer’s definition represents the best possible general approach to this from the limited 
data of the l.d. alone. Haldane’s definition fits the geneticist’s conception, and his in- 
vestigation is an attempt to get the best estimate of the distance by making approximate 
assumptions as to what happens between the observed genes. It is based on the
unobservable crossover-distribution of a supposed infinite set of genes, but can be applied to
particular models of this infinite c.d. so as to derive results which involve only a finite and
observable c.d. Finally it should be mentioned that in the paper quoted, Haldane gave
also an alternative method for the case $p = 2$, leading to the same formula (7'), which is
really equivalent to defining the distance as the mathematical expectation of the number of
chiasmata (not crossovers in G.'s sense) in the interval $(i, j)$.”





A CRITERION OF CONVERGENCE FOR THE CLASSICAL ITERATIVE 
METHOD OF SOLVING LINEAR SIMULTANEOUS EQUATIONS 


By CLIFFORD E. BERRY


Consolidated Engineering Corporation, Pasadena, Calif. 


The recent development of two devices¹,² for solving linear simultaneous
equations by means of the classical iterative method³ has stimulated the writer
to investigate convergence criteria for the method. There are in the literature⁴
necessary and sufficient criteria for convergence of symmetric systems, and sufficiency
criteria for general systems. So far as the writer knows, however, this
is the first development of a necessary and sufficient criterion for convergence
in the general case. The results obtained are applicable to any arbitrary square
non-singular matrix in which $a_{ii} \ne 0$.

Let the set of equations be represented by 


(1)    $AX = G$,





1 Morgan, T. D., Crawford, F. W., “Time-saving computing instruments designed
for spectroscopic analysis”, The Oil and Gas Journal, August 26 (1944), pp. 100-105.

2 Berry, C. E., Wilcox, D. E., Rock, S. M., Washburn, H. W., “A computer for solving
linear simultaneous equations”, to be published.

3 Hotelling, Harold, “Some new methods in matrix calculation”, The Annals of Mathematical
Statistics, Vol. XIV (1943), pp. 1-34.

4 Mises, R. von and Pollaczek-Geiringer, Hilda, “Zusammenfassende Berichte. Praktische
Verfahren der Gleichungsauflösung”, Zeitschrift für angewandte Math. und Mechanik,
Vol. 9 (1929), pp. 58-77, and 152-164.
in which $A$ is the square matrix of the coefficients, $X$ is the column matrix of the
unknowns, and $G$ is the column matrix of the constant terms. $|A|$ is the determinant
of $A$.

We define a matrix $A_1$ which contains the prediagonal and diagonal terms of $A$,
and a matrix $A_2$ which contains the postdiagonal terms of $A$. According to this
definition,

(2)    $A_1 + A_2 = A$.


In the classical iterative method, arbitrary (or approximate) values of the $x$'s
are chosen, the first equation is solved for the first unknown, the second equation
for the second unknown, etc., using in each equation the most recent approximations
to the $x$'s. This process may be written

(3)    $A_1X^{(1)} + A_2X^{(0)} = G$,

in which $X^{(0)}$ is the initial approximation matrix, and $X^{(1)}$ is the approximation
matrix existing at the end of the first iterative cycle. The superscripts indicate
the number of the approximation. The next cycle is described by

(4)    $A_1X^{(2)} + A_2X^{(1)} = G$,

and the $m$th by

(5)    $A_1X^{(m)} + A_2X^{(m-1)} = G$.

The method yields a solution, i.e., converges, if

$\lim_{m \to \infty} (X^{(m)} - X) = 0$.

Solving (5) explicitly for $X^{(m)}$,

(6)    $X^{(m)} = A_1^{-1}G - A_1^{-1}A_2X^{(m-1)}$.

Subtracting $X$ from each side,

(7)    $X^{(m)} - X = A_1^{-1}G - A_1^{-1}A_2X^{(m-1)} - X$,

and making use of (1) and (2),

(8)    $X^{(m)} - X = -A_1^{-1}A_2(X^{(m-1)} - X)$.

Since (8) applies for any value of $m$, we may write

(9)    $X^{(m)} - X = (-A_1^{-1}A_2)^2(X^{(m-2)} - X)$,

and continuing this process,

(10)    $X^{(m)} - X = (-A_1^{-1}A_2)^m(X^{(0)} - X)$.

Now, $\lim_{m \to \infty} (X^{(m)} - X) = 0$ if and only if

(11)    $\lim_{m \to \infty} (-A_1^{-1}A_2)^m = 0$.
This is a general result, applicable to any arrangement of the terms of an arbitrary
square matrix $A$, subject only to the conditions that $|A| \ne 0$ and that
no diagonal term of $A$ is zero. In this latter exceptional case, the iterative
method itself obviously cannot be applied.
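
The iterative cycle described above, in which each equation is solved in turn for its own unknown using the most recent values of the others, can be sketched in a few lines of Python. This is a minimal illustration rather than a description of either computing device; the function name `iterate` and the 3 × 3 system are hypothetical.

```python
def iterate(A, G, X0, cycles):
    """Classical iterative method: solve equation i for unknown i, cycle after cycle.

    Within a cycle, unknowns before position i have already been updated
    (prediagonal terms), while those after i still hold the previous
    cycle's values (postdiagonal terms).
    """
    n = len(A)
    X = list(X0)
    for _ in range(cycles):
        for i in range(n):
            s = sum(A[i][j] * X[j] for j in range(n) if j != i)
            X[i] = (G[i] - s) / A[i][i]  # requires a nonzero diagonal term
    return X


# Hypothetical system whose exact solution is X = (1, 2, 3).
A = [[4.0, 1.0, 0.0],
     [1.0, 5.0, 2.0],
     [0.0, 2.0, 6.0]]
G = [6.0, 17.0, 22.0]

X = iterate(A, G, [0.0, 0.0, 0.0], 50)
print(X)
```

The diagonal terms of this example are large compared with the off-diagonal terms, so the iteration settles quickly; reordering the equations can change that behavior.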

The criterion (11) clearly shows that the order in which the elements of the 
matrix A are arranged is important. For instance, it is plain that an arrange- 
ment in which the diagonal terms are large and the off-diagonal terms, particu- 
larly the post-diagonal terms, are small will tend to favor convergence. 

A somewhat relaxed condition, which is sufficient but not necessary, is obtained
through the use of an inequality used by Hotelling³, namely,

(12)    $N(B^m) \le [N(B)]^m$,

in which $N(B)$ is the norm of the matrix $B$, that is, the square root of the sum
of the products of its elements by their complex conjugates, or in the case of a
real matrix the square root of the sum of the squares of the elements.

The condition is that, if

(13)    $N(A_1^{-1}A_2) < 1$,

then

(14)    $\lim_{m \to \infty} (A_1^{-1}A_2)^m = 0$.

Criterion (13) is readily computed, since $A_1^{-1}$, the reciprocal of a triangular
matrix, is readily computed, and the post-multiplication by $A_2$ involves a number
of zero terms.

A more stringent condition than (13), though still not a necessary condition,
is that if some finite number $p$ can be found such that

(15)    $N[(A_1^{-1}A_2)^p] < 1$,

then (14) follows. Since $n$ matrix squarings result in a value of $p = 2^n$, the size
of the norm for fairly large values of $p$ can be investigated without excessive
labor.
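
As a rough sketch of how the norm criteria might be evaluated, the Python fragment below splits $A$ into its prediagonal-plus-diagonal part and its postdiagonal part, forms the product of the inverse of the first with the second by forward substitution, and computes the norm after a chosen number of matrix squarings. All function names and the sample matrix are hypothetical.

```python
import math


def split(A):
    """Return (A1, A2): diagonal and prediagonal terms, and postdiagonal terms."""
    n = len(A)
    A1 = [[A[i][j] if j <= i else 0.0 for j in range(n)] for i in range(n)]
    A2 = [[A[i][j] if j > i else 0.0 for j in range(n)] for i in range(n)]
    return A1, A2


def solve_lower(L, B):
    """Forward substitution: the matrix Y with L Y = B, for lower-triangular L."""
    n, m = len(L), len(B[0])
    Y = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for k in range(m):
            s = sum(L[i][j] * Y[j][k] for j in range(i))
            Y[i][k] = (B[i][k] - s) / L[i][i]
    return Y


def matmul(P, Q):
    """Ordinary matrix product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def norm(B):
    """Square root of the sum of the squares of the elements (real matrix)."""
    return math.sqrt(sum(b * b for row in B for b in row))


def criterion(A, squarings=0):
    """Norm of (A1^{-1} A2)^p with p = 2**squarings; a value below 1 is sufficient."""
    A1, A2 = split(A)
    B = solve_lower(A1, A2)
    for _ in range(squarings):
        B = matmul(B, B)
    return norm(B)


A = [[4.0, 1.0, 0.0],
     [1.0, 5.0, 2.0],
     [0.0, 2.0, 6.0]]  # hypothetical coefficients

print(criterion(A), criterion(A, squarings=2))
```

A value below 1 at any number of squarings establishes convergence; a value of 1 or more settles nothing, since the condition is sufficient but not necessary.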


A REMARK ON INDEPENDENCE OF LINEAR AND QUADRATIC 
FORMS INVOLVING INDEPENDENT GAUSSIAN VARIABLES 


By M. Kac


Cornell University 


The purpose of this note is to call attention to the following useful theorem, 
which to the best of my knowledge was never stated explicitly. 

If $X_1, X_2, X_3, \ldots, X_n$ are identically distributed, independent Gaussian random
variables each having mean 0, then the necessary and sufficient condition that

$\sum_{j,k=1}^{n} a_{jk}X_jX_k$    and    $\sum_{j=1}^{n} a_jX_j = a \cdot X$

be independent, is that

$Aa = O$,

where $A$ is the matrix of the quadratic form, $a$ the vector $(a_1, a_2, \ldots, a_n)$ and $X$ the
vector $(X_1, X_2, \ldots, X_n)$.

PROOF OF SUFFICIENCY.¹ Since $Aa = O$, it follows that 0 is an eigenvalue of $A$,
and $a$ is a corresponding eigenvector.

Denoting by $\lambda_2, \ldots, \lambda_n$ the remaining eigenvalues and by $\beta_2, \ldots, \beta_n$ the
corresponding eigenvectors, we have

$\sum_{j,k=1}^{n} a_{jk}X_jX_k = \sum_{j=2}^{n} \lambda_j(\beta_j \cdot X)^2$.

Since the $\beta$'s are orthogonal to $a$, it follows that the linear combinations $\beta_j \cdot X$
are independent of $a \cdot X$, and this completes the proof.
PROOF OF NECESSITY. From the assumption of independence it follows that

$\sum_{j,k=1}^{n} a_{jk}X_jX_k$    and    $\left(\sum_{j=1}^{n} a_jX_j\right)^2 = \sum_{j,k=1}^{n} a_ja_kX_jX_k$

are independent. Thus by Craig's theorem²

$AB = O$,

where $B = ((a_ja_k))$.

This implies almost immediately that $Aa = O$.
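
The last step can be made explicit: since $B = ((a_ja_k))$, the $k$th column of $AB$ is $a_k$ times the vector $Aa$, so that $AB = O$ with $a \ne O$ forces $Aa = O$. The short Python check below illustrates this identity on a hypothetical symmetric matrix; the helper names are not from the note.

```python
def matvec(A, v):
    """Product of the matrix A (list of rows) with the vector v."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in A]


def outer(u, v):
    """Matrix with entries u_j * v_k."""
    return [[uj * vk for vk in v] for uj in u]


def matmul(P, Q):
    """Ordinary matrix product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[1, -1, 0],
     [-1, 1, 0],
     [0, 0, 2]]     # hypothetical symmetric matrix of a quadratic form

a = [1, 1, 0]       # here A a = 0
b = [1, 0, 0]       # here A b != 0

AB_a = matmul(A, outer(a, a))   # equals outer(A a, a), the zero matrix
AB_b = matmul(A, outer(b, b))   # equals outer(A b, b), not zero

print(matvec(A, a), AB_a)
print(matvec(A, b), AB_b)
```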

1 Added in proof: Dr. L. Guttman has kindly pointed out to me that the proof of
sufficiency given here has been used by D. Jackson in the article “Mathematical principles
in the theory of small samples”, Amer. Math. Month., Vol. 42 (1935), pp. 344-364, see in
particular pp. 354-355. Jackson considers only the independence of $\bar{x}$ and $s^2$, which is of
crucial importance in deriving Student's distribution.

2 A. T. Craig, Annals of Math. Stat., Vol. 14 (1943), pp. 195-197; see also H. Hotelling,
ibid., Vol. 15 (1944), pp. 427-429.
1. On The Variance of a Random Set in n Dimensions. Herbert Robbins,
Lieutenant USNR Postgraduate School, Annapolis, Md.


Using a general formula for the moments of the measure of a random set X (Ann. Math.
Stat. Vol. XV (1944), pp. 70-74) we find the mean and variance in the case where X is a
random sum of n-dimensional intervals with sides parallel to the coordinate axes, thus
generalizing the results previously found (loc. cit.) for the case n = 1.


2. The Non-Central Wishart Distribution and its Application to Problems in 
Multivariate Statistics. T. W. Anderson, Princeton University.


The non-central Wishart distribution is the joint distribution of sums of squares and 
cross-products of deviations of observations from multivariate normal distributions with 
identical variance-covariance matrices and with different sets of means. The rank of the 
non-central Wishart distribution is defined as the rank of the matrix of sets of means. In a
previous paper (by M. A. Girshick and the present author) the non-central Wishart
distribution is given explicitly for the rank one and two cases and indicated for the case of any
rank. In the present paper the characteristic function of the non-central Wishart
distribution is given for general rank. The distribution, which is given in the form of a multiple
integral, is the product of a central Wishart distribution and a symmetric function of the 
roots of a determinantal equation involving the matrix of squares and cross products of 
observations and the matrix of population means. It is shown that the convolution of two 
non-central Wishart distributions is again a non-central Wishart distribution if the vari- 
ance-covariance matrices are the same. The moments of the generalized variance and the 
moments of the likelihood ratio criterion for testing certain linear hypotheses (for example, 
the hypothesis that the means of a set of populations are identical, given that the matrices 
of population variances and covariances are the same) are obtained for the linear and planar 
non-central cases in terms of infinite series. Likelihood ratio criteria are developed for 
testing the dimensionality of the means of a set of multivariate populations (with identical 
variances and covariances) on the basis of one sample from each. The criterion for testing 
whether the dimensionality is h in the space of p dimensions is a symmetric function of p—h 
smallest roots of the determinantal equation involving the sample estimate of the matrix 
of variances and covariances and the sums of squares and cross-products of deviations of 
sample means. The maximum likelihood estimate of the hyperplanes and positions of 
means on them are obtained. The asymptotic distributions of the criteria are
$\chi^2$-distributions.


3. The Effect on a Distribution Function of Small Changes in the Population 
Function. Burton H. Camp, Wesleyan University. 


It is generally assumed in the application of distribution theory that, if the actual popu- 
lation function is not very different from the one used in the theory, then the true sampling 
distribution of a statistic will not be very different from the one obtained in the theory. 
But elsewhere in mathematics we do not assert that a conclusion will be only slightly modi- 
fied by a small deviation in the hypothesis. This paper presents some theorems which are 
useful in determining the maximum effect on a sampling distribution of certain kinds of 
small changes in the population function. 
4. Composite Distributions. Casper Goffman and Benjamin Epstein,
Westinghouse Electric Corporation.


Let $f(x; \theta_1, \theta_2, \ldots, \theta_n)$ be a function such that for every point $\theta_1 = \theta_{10}, \ldots, \theta_n = \theta_{n0}$ in
parameter space, $x$ is a random variable with p.d.f. $f(x; \theta_{10}, \ldots, \theta_{n0})$. Suppose further
that the parameters $\theta_1, \theta_2, \ldots, \theta_n$ are themselves random variables whose p.d.f.'s are
given respectively by $\phi(\theta_1), \ldots, \phi(\theta_n)$. Using a concept of “probability contained in an
interval” and an axiom based on this concept, we show that $x$ is a random variable with
p.d.f. $g(x)$ given by the formula

(1)    $g(x) = \int \cdots \int f(x; \theta_1, \ldots, \theta_n)\,\phi(\theta_1) \cdots \phi(\theta_n)\,d\theta_1 \cdots d\theta_n$.

In this paper we consider statistical properties of the function $g(x)$ in cases of particular
interest in applications. The cases treated here are (a) where the mean, $\bar{x}$, is the only variable
parameter, (b) where the standard deviation, $\sigma$, is the only variable parameter, and
(c) where the mean, $\bar{x}$, and the standard deviation, $\sigma$, are both variable parameters, $\bar{x}$ and $\sigma$
being independent.

It is shown that problems (a) and (b) are equivalent respectively to the sum and product
of two independent random variables, one of which has zero mean. Formulae for the
moments in problem (c) are then derived in terms of the formulae obtained for (a) and (b).


5. Population, Expected Values and Sample. E. J. Gumbel, New School for
Social Research.


Let $x$ be an unlimited continuous variate, and let $F(x)$ be the probability of a value equal
to, or less than, $x$. Then the expected $m$th values $\bar{x}_m$, for $n$ observations, are approximations
to the most probable $m$th values and defined by $F(\bar{x}_m) = F_1 + (F_n - F_1)(m - 1)/(n - 1)$,
where $F_1$ and $F_n$ are the probabilities of the most probable first and the most probable
last value. The probabilities $F_1$, $1 - F_n$ and $(F_n - F_1)/(n - 1)$ are of the order of
magnitude $1/n$.

The distribution of the expected values $\bar{x}_m$ differs from the distribution of the sample
and from the theoretical distribution. However, for a symmetrical distribution the mean
and the odd moments about mean calculated from the expected values coincide with the
mean and the moments of the population. For the normal distribution, the expected
standard deviation $\sigma(n)$ divided by the standard deviation $\sigma$ of the population and traced
on normal probability paper approximates a linear function of $\sqrt{\log n}$. The approach of
$\sigma(n)$ toward $\sigma$ is slow. For 500 observations, $\sigma(n)$ is about 99% of $\sigma$. The moments of the
distribution of the expected values exist even in the case that the moments of the theoretical
distribution diverge.





6. On Optimum Estimates for Stratified Samples. Morris H. Hansen and
William N. Hurwitz, Bureau of the Census.


A stratified sample is drawn from a population with R strata. Neyman found the optimum
sample allocation for the “best unbiased linear estimate.” However, biased but
consistent estimates of the form $\sum_i x_i / \sum_i y_i$, where both $x_i$ and $y_i$ are random variables, have been
found to give more reliable results in a large class of problems. Even more efficient estimates
can be obtained by finding the values of $n_i$ (the sample size) and $w_i$ which minimize
the mean square error of estimates of the form $\sum w_i \frac{x_i}{y_i}$ or $\frac{\sum w_i x_i}{\sum w_i y_i}$.
7. Pearsonian Correlation Coefficients Associated with Least Squares Theory.
Paul S. Dwyer, University of Michigan. (Read by Title).


In least squares theory we have the predicting variable $x$, the observed value of the
predicted variable, $y$, the residual $e$, and the predicted value of the predicted variable, $\hat{y}$.
The purpose of this paper is to study the Pearsonian coefficients resulting from correlating 
all these variables in pairs (a) in the case of a single predicted variable and (b) in the case 
of two or more predicted variables. The results yield such coefficients as multiple correla- 
tion, multiple alienation, partial correlation, part correlation, and new coefficients not 
previously in use. The results are given in expanded, determinant, and matrix form. A 
simplified calculational technique is provided. 


NEWS AND NOTICES 
Readers are invited to submit to the Secretary of the Institute news items of interest


Personal Items 


Dr. Kenneth Arnold, recently with the Statistical Research Group, Columbia 
University, has accepted an assistant professorship in Mathematics at the Uni- 
versity of Wisconsin. 

Dr. Leo Aroian has returned to his position at Hunter College after serving as 
Research Associate in the Applied Mathematics Panel Project at the University 
of California. 

Mr. Geoffry Beall is now statistician for the Institute of Paper Chemistry at 
Appleton, Wisconsin. 

Mr. Robert E. Breden has accepted a position with the Personnel Research 
Department of Proctor and Gamble at Cincinnati. 

Mr. William F. Elkin, who has been Social Science Analyst with the Vital 
Statistics Division of the Bureau of the Census, has accepted a position as Vital 
Statistician at Oak Ridge, Tenn. 

Mr. Robert M. Ewing of the U. S. Rubber Company has been transferred to
Detroit. He now serves in the capacity of Tire Development Engineer. 

Dr. A. S. Householder, formerly of the University of Chicago, is now with the
Fire Control Division of the Naval Research Laboratory in Washington. 

Dr. Irving Kaplansky has been appointed to an assistant professorship of 
mathematics at the University of Chicago. 

Mr. Amrom H. Katz has been promoted from Associate Physicist to Physicist 
at the Aerial Photographic Laboratory at Wright Field. 

Dr. William G. Madow of the Bureau of the Census will serve as Visiting 
Professor of Statistics at the University of São Paulo, Brazil, for the full academic
year which begins on March 16. He expects to return to the United States in 
January of 1947. 

Dr. J. E. Morton, formerly of Knox College, has joined the staff of the National 
Bureau of Economic Research. 

Dr. A. C. Olshen has returned from his navy work in Washington to his position 
as Actuary and Chief Examiner of the Oregon Insurance Department at Salem, 
Oregon. 

Mr. Joseph S. Rhodes (formerly Joseph Rosenthal) now holds the position of
Sampling Specialist in the Bureau of the Census. 

Prof. Paul R. Rider, on leave from Washington University, is teaching at 
Shrivenham American University in England.

Dr. J. Wolfowitz has accepted an associate professorship in Statistics at North 
Carolina State College. Professor Wolfowitz is serving as Associate Editor of 
the Journal of the American Statistical Association. 
New Members
The following persons have been elected to membership in the Institute: 


Astrachan, Asso. Prof. Max, Ph.D. (Brown) Antioch College, Yellow Springs, Ohio.

Bales, R. P., B.A. (Toronto) Tech. Sup., Dominion Rubber Co., St. Jerome, Que. Can. 

Baños, Olegario Fernandez, D.C. (Madrid) Catedratico, Univ. of Madrid, Calle Lopez
de Hoyos 7, Spain. 

Barkan, Herbert, M.A. (Columbia) Econ. Analyst, 50 8th Ave., Brooklyn, N. Y. 

Bloom, Royal F., M.A. (Minnesota) Lt. Comdr. USNR, Test and Res. Section, Bureau of 
Naval Personnel, 61 G. Ridge Road, Greenbelt, Md. 

Blommers, Paul J., Ph.D. (Iowa) Univ. Examiner and Registrar, 114 Univ. Hall, State
Univ. of Iowa, Iowa City, Iowa. 

Brier, Glenn, A.M. (George Washington) Meteorologist, US Weather Bureau, Washington 
25, D.C. 

Brixey, Nancy, B.A. (Vassar) Economists, Davis and Gilbert Law Firm, 1 E. 44 St., 70 
East 77 St. New York 21, N.Y. 

Caplan, Benjamin, Ph.D. (Chicago) Econ., OPA, 2831 28th St. N.W., Washington 8, D. C. 

Chassan, Jack, B.S. (C.C.N.Y.) Stat., Office of Stat. Control, Hdq. A.A.F. 3013 30th St. 
S.E., Washington, D. C. 

Cornell, Dr. F.G., Ph.D. (Columbia) U.S. Office of Ed., Tempo. M. 26th and Water, N.W., 
Washington, D. C. 

Cramér, Prof. Harald, Ph.D. (Stockholm) Skärviksvägen 7, Djursholm, Sweden.

Dempsey, William B., Ph.D. (Harvard) Regent of the School of Commerce and Finance, 
Saint Louis Univ., 3674 Lindell Blvd., St. Louis 8, Mo.

Derrick, Asst. Prof. Lucile, M.A. (Peabody) Univ. of Chicago, School of Business, 5542 
Kimbark, Chicago 37, Ill. 

Dominguez, Emilia A., Ec.S. (Buenos Aires) Actuary, Supt. Personas Juridicas de Buenos 
Aires, Martinez Castro 765, Buenos Aires, Argentina. 

Dominguez, Jose F., Ec.S. (Buenos Aires) Tech. Council Instituto Nacional de Prevision
Social, Martinez Castro 765, Buenos Aires, Argentina 

Duncan, Asst. Prof. Acheson, Ph.D. (Princeton) Econ. Dept., Princeton Univ., Princeton, 
N.J. 

Dyson, John D., B.S. (South Dakota State) Major, U.S. Army, Fitzsimons Gen. Hosp., 
Denver, 108 S. Jefferson, Pierre, So. Dak. 

Elmore, Francis B., B.S. (Clemson) Capt., Ord. Dept. Inspection of Ammunition, 505 
Kingston Drive, St. Louis 23, Mo.

Franzen, Raymond, Ph.D. (Columbia) Stat. Consultant, 10 Rockefeller Plaza, New York 
20, N. Y.

Friedman, Bernard, Ph.D. (Mass. Inst. Tech.) Res. Math., A.M.P., N.Y.U., 3741 81 St. 
Jackson Heights, N. Y. 

Gordon, J. J., Staff Stat. Eng. Quality Control, Western Electric Company, Inc., 100 Central 
Ave., Kearny, New Jersey. 

Gough, Elsie L., M.A. (Michigan) Auditing Clerk, 648 Blvd. Way, Oakland 10, Calif.

Greene, Kenneth E., B.S. (Yale) Asst. Res. Mgr., Nat. Broadcasting, 4784 Post Road, 
Pelham Manor 65, N. Y. 

Haskins, Asso. Prof. Elmer E., Ph.D. (Boston) Northeastern Univ., Boston, 53 Damien 
Rd., Wellesley Hills 82, Mass. 

Humes, Helen M., M.A. (Pittsburgh) Price Econ., Bureau of Labor Stat., U. S. Dept. of 
Labor, 3703 34th St., N.W., Washington 8, D.C. 

Jackson, Irwin E., M.A. Mec. Eng. (Pennsylvania) Lt., Cadet Ground School Inst., Box 
153, Tuskegee Army Air Field, Tuskegee, Ala. 

Jarrett, Rheem F., B.A. (Arizona) Lecturer in Psych., Dept. Psych., Univ. of Calif.,
Berkeley 4, Calif.

Johnsen, Madeline, A.M. (Stanford) 2449 14th Ave., San Francisco, Calif.

King, Frederick G., B.A. (Harvard) Capt. U. S. Army, 1629 Que St., N.W., Washington,
D. C.

Leipnik, Roy B., B.S. (Chicago) Res. Asst., Cowles Comm. for Res. in Economics, 5527 S. 
Kenwood, Chicago 37, Ill.

Lesser, Grace L., B.A. (Hunter) Asst. Math., Applied Math. Group, Columbia Univ., 
1576 Unionport Rd., Bronx 62, N.Y. 

deLoor, Prof. Barend, Ph.D. (Amsterdam) Univ. of Pretoria, Pretoria, Union of South
Africa. 

MacNeish, Harris F., Ph.D. (Chicago) Chairman Math. Dept., Brooklyn College, Bedford 
Ave. & Ave. H, Brooklyn, N. Y. 

Maddrill, James D., Ph.D. (California) Math. Res. and Dev. Ballistic Res. Lab., Aber- 
deen Proving Ground, Md. 

Madow, Lilian H., M.A. (American) 1445 Ogden Street, N.W., Washington 10, D.C. 

Marrian, Dixon M., M.A. (Columbia) Instr. Math., U.S. Military Academy, West Point,
N. Y.

Martin, Charles C., B.S. (U.S. Military Academy) Lt., Ordnance Dept. U. S. Army, Box
363, Hot Springs, New Mexico. 

Monderer, Phyllis, B.A. (Hunter) Asst. Math., Applied Math. Group, Columbia Univ. 
Div. War Res., New York, N. Y., 529 West 170 Street, New York 33, N. Y. 

Moore, Margaret W., B.A. (Wilson) Stat., P-3, War Dept., Letterkenny Ordnance Depot,
Chambersburg, Pa., 304 Lincoln Way West, Chambersburg, Pa. 

Pope, Otis, Ph.D. (Iowa State) Senior Biometrician, USDA, Tech. Collaboration Branch, 
Washington, D. C. 

Priestley, Alice E., M.A. (New York) Instr. Stat. and Math., Wilson College, Chambers- 
burg, Pa. 

Rafferty, J. Allan, B.S. (Harvard) Medical Student, Pfc., ASTP (AUS) Box 236, Rochester
Med. School, Rochester 7, N. Y. 

Randall, Robert J., B.S. (Yale) Lt., Post Weight and Balance Officer, Tuskegee Army Air 
Field, Tuskegee, Ala. 

Reiner, Mae, B.A. (Hunter) Asst. Math., Applied Math. Group, Columbia Univ., Div.
of War Res., 170 Second Avenue, New York 3, N.Y. 

Rodal, Prof. Juan A., Ph.D. (Buenos Aires) Univ. of Buenos Aires, Aviles 3755, Buenos 
Aires, Argentina. 

Rubin, Herman, S.M. (Chicago) 7142 East End Ave., Chicago 49, Ill.

Schmalz, W. H., B.A. (Toronto) Tech. Supt., Merchants Factory Dominion Rubber Co., 
Kitchener, Ont., Canada. 

Simmons, Willard R., M.A. (Duke) Head of Stat. Section, Food and Automotive Ration- 
ing, Div., OPA, 1420 Saratoga Ave., N.E., Washington, D. C. 

Sobel, Milton, B.S. (C.C.N.Y.) 38 Elliot Place, The Bronx, N. Y.

Stauber, B. R., M.A. (Minnesota) Chief, Relocation Planning Div., War Relocation
Authority, U. S. Dept. of the Interior, 9701 Bexhill Drive, Kensington, Maryland.

Steen, Jerome R., B.S. (Wisconsin) Mgr., Quality Control Eng., Sylvania Electric Prod-
ucts, Inc., Emporium, Pa.

Sullivan, John W., Sc.D. (Mass. Inst. Tech.) Metallurgist, American Iron and Steel In- 
stitute, 350 Fifth Ave., New York 1, N. Y. 

Trowbridge, Frederick, Quality Control Eng., Sentinel Radio Corp., 2020 Ridge Ave., Evan- 
ston, Ill. 

Weck, Frank A., B.A. (Stanford) Capt., MAC, AUS, Chief, Stat. Analysis Branch, Med- 
ical Stat. Div., Office of the Surgeon General, 1818 H St., N.W., Washington 25, D. C. 

Weiss, Samuel, M.A. (Michigan) Chief, Manpower Estimates Section, War Manpower 
Comm., 3073 S. Buchanan, Arlington, Virginia. 

Wold, Prof. Herman O., Ph.D. (Stockholm) Univ. of Uppsala, Stat. Inst., Odinslund 2, 
Uppsala, Sweden. 
Announcement of the St. Louis Meeting of the Institute


The Institute of Mathematical Statistics will hold a joint meeting with 
Section A (Mathematics) of the American Association for the Advancement of 
Science on Saturday, March 30 at 2 P.M. in St. Louis. All the details are 
not yet available but the session will feature (1) contributed papers on Statis- 
tics and Probability, (2) an address by Lt. Commander John H. Curtiss on the 
topic Statistical Inference and its Engineering Applications, and (3) an address 
by Mr. Morris H. Hansen on Sampling Problems in Surveys of Business and 
Population. 


Meeting of Washington Chapter 


A joint regional meeting of the Washington Chapter of the Institute and the 
Washington Chapter of the American Statistical Association is being planned 
for April 12-13, 1946. 





MEMBERS OF THE INSTITUTE OF MATHEMATICAL 
STATISTICS* 


(As of November 15, 1945) 


(The names of Fellows of the Institute are designated by * and Life Members by †)


Abbey, Helen M.A. (Michigan) Stat., Bur. of Records and Stat., Mich. Dept. of Health, 
916 N. Chestnut, Lansing, Mich. 

Acerboni, Prof. Argentino V. Dr. Ec. (Buenos Aires) Facultad de C. Economicas, Buenos 
Aires, Argentina, Larroque 232, Banfield, Argentina 

Acton, Forman Ch.E. (Princeton) T/4, Army of US, Corps of Engineers, S.L.D., Bar-
racks Area, Oak Ridge, Tenn.

Aitchison, Beatrice Ph.D. (Johns Hopkins) Econ. and Stat. Analyst, Interst. Com-
merce Comm., Washington 25, D. C. 1929 S St., N.W., Washington 9, D.C.

Allen, Prof. Roy G. D.Sc. (London) London School of Econ., Houghton St., Aldwych,
London, W.C. 2.

Allendoerfer, Asso. Prof. Carl B. Ph.D. (Princeton) Haverford College, Haverford, Pa. 

Alt, Franz L. Ph.D. US Army, 271 Fort Washington Ave., New York City 32 

Alter, Dinsmore Ph.D. (California) Res. Asso. in Math. Theory of Stat., Calif. Inst. of 
Tech., Dir. Griffith Observatory, Los Angeles, Calif., Col. T.C., US Army, 211 Pier 2, 
Brooklyn Army Base, Brooklyn, N.Y. 

Anderson, Paul H. Ph.D. (Illinois) Econ. Analyst, Office of Surplus Property, Dept. of 
Commerce, Washington, D.C. 1228 Blair Mill Rd., Silver Spring, Md. 

Anderson, Asso. Prof. Richard L. Ph.D. (Iowa State) Res. Math., Inst. of Stat., N.C. 
State College, Raleigh, N. C. 

Anderson, Theodore W., Jr. Ph.D. (Princeton) Res. Math., Cowles Commission for 
Res. in Econ., Univ. of Chicago, Chicago 37, Ill. 

Andrews, Asst. Prof. T. Gaylord Ph.D. (Nebraska) Univ. of Chicago, Chicago, Ill.

Angell, Dorothy T. Stat. Analyst, Bell Tel. Labs., Murray Hill, N. J. 

Arias, B., Jorge C.E. (Guatemala) 3 Avenida Sur 65, Guatemala City, Guatemala, 
Central America 

Arnold, Asso. Prof. Herbert E. Ph.D. (Yale) Wesleyan Univ., Middletown, Conn. 

Arnold, Asst. Prof. Kenneth J. Ph.D. (Mass. Inst. Tech.) Univ. of Wisconsin, Madison 
6, Wis. North Hall 

Aroian, Leo A. Ph.D. (Michigan) Instr. Hunter Coll., New York City. 247 Wadsworth 
Ave., New York City 33 

Arrow, Kenneth J. M.A. (Columbia) Lydig Fellow, Columbia Univ., 116th St. and 
Broadway, New York City, Capt. AC, Hq. AAF, Weather Service, Asheville, N. C. 
218 South French Broad Avenue, Asheville 


* Members were asked to supply fresh information for this Directory. Records may be 
inexact or incomplete (1) because of the failure of some member to comply with this request, 
(2) because the directory card became obsolete as a result of an unreported change of address, 
(3) because information about position did not accompany a notice of change of address, or 
(4) because it is impossible to give all the information about men on leave in the standard 
form of “position,” “address,” and (in italics) “home or mail address.” Some members
on leave or in the services have reported the permanent address. Some have reported the
“on leave” or “APO” address, as the mailing address. The addresses given are the last
reported addresses. When an address is known to be in error, it is followed by (last address).
Changes in addresses or errors in names, titles or addresses, should be reported to 
the Secretary. 
Astrachan, Asso. Prof. Max Ph.D. (Brown) Antioch College, Yellow Springs, Ohio
Auner, George B.A. (Western Reserve) 12418 Iowa Ave., Cleveland 8, Ohio 
Bachelor, Robert W. M.B.A. (Washington) American Bankers Assn., 22 East 40th St., 
New York City 16 
Bacon, Asso. Prof. Harold M. Ph.D. (Stanford) Stanford Univ., Stanford, Calif. Box
1114 
Bailey, Arthur L. B.S. (Michigan) Stat., American Mutual Alliance, 60 E. 42nd St., 
New York City, P. O. Box 278, Ramsey, N. J. 
*Baker, Asst. Prof. George A. Ph.D. (Illinois) Asst. Prof. of Math. and Asst. Stat., 
Exp. Sta., Coll. of Agri., Univ. of California, Davis, Calif. 
Baldwin, Woodson W. S.B. (Mass. Inst. Tech.) Capt., Ord. Dept., USA Office of Field 
Dir. of Ammunition Plants, 3629 Lindell Blvd., St. Louis 8, Mo. 3745 Lindell Blvd. 
Bales, R. P. B.A.Sc. (Toronto) Tech. Supt., Dominion Rubber Co., St. Jerome, Que. 
Canada 
Bancroft, Asst. Prof. Theodore A. Ph.D. (Iowa State) Iowa State Coll., Math. Dept., 
Ames, Iowa 
Baños, Olegario Fernandez D.C. (Madrid) Catedratico, Univ. of Madrid, Calle Lopez
de Hoyos 7, Spain
Barkan, Herbert M.A. (Columbia) Econ. Analyst, 50 8th Ave., Brooklyn, N. Y. 
Barnes, Jarvis M.A. (George Peabody Coll. for Teachers) Atlanta Board of Educ. 
14th Floor, City Hall, Atlanta, Ga. 
Barnes, Prof. John L. Ph.D. (Princeton) Chairman, Dept. of Applied Math., Tufts 
Coll., Medford 55, Mass., 16 Ardley Road, Winchester 
Barr, Prof. Arvil S. Ph.D. (Wisconsin) Univ. of Wisconsin, Madison, Wis. 
Barral-Souto, Prof. José Sc.D. (Buenos Aires) Univ. of Buenos Aires, Buenos Aires, 
Argentina, Cordoba 1459 
*Bartky, Asso. Dean Walter Ph.D. (Chicago) Univ. of Chicago, Chicago, Ill.
Bartlett, Maurice D.Sc. (London) Univ. Lecturer, Cambridge, 137 Chesterton Road,
Cambridge, Eng. 
Bassford, Horace R. B.A. (Trinity) Vice Pres. and Actuary, Metropolitan Life Ins. Co., 
1 Madison Ave., New York City 10 
*Baten, Prof. William D. Ph.D. (Michigan) Prof. of Math. Mich. State Coll. and Res. 
Prof. Mich. Agri. Exp. Sta., Mich. State Coll., E. Lansing, Mich. 41/1 Marshall St. 
Bates, Prof. O. Kenneth Sc.D. (Mass. Inst. Tech.) Prof. of Math. and Head of Dept.,
The St. Lawrence Univ., Canton, N. Y. 
Battin, Asst. Prof. Isaac L. A.M. (Swarthmore) Drew Univ., Madison, N. J. 14 Glen- 
wild Rd. 
Beall, Geoffrey Ph.D. (London) Res. Asso., Inst. of Paper Chemistry, Appleton, Wis. 
Bechhofer, Robert E. B.A. (Columbia) Stat., The Kellex Corp., 233 Broadway, New 
York City. 181 Degraw Avenue, Teaneck, N. J. 
Becker, Harold W. Elec. Inst., Mare Is. Training School, Bldg. 146, Mare Island, Calif. 
1426 Amador, Vallejo 
Beckstead, Gordon L. Lt.(j.g.) USNR, Weather Central NAS, San Diego, Calif. 
Beebe, Gilbert W. Ph.D. (Columbia) Lt. Sn C., AUS Control Div., Office of the Sur- 
geon General, 1818 H St., N.W., Washington, D. C. 
Been, Richard O. M.A. (George Washington) Sr. Agri. Econ., US Bur. of Agri. Econ., 
3433 South Bldg., Washington, D. C. 
Bellison, Harold R. S.M. (Mass. Inst. Tech.) Industrial Eng., War Dept., Ord. Dept., 
Pentagon Bldg., Arlington, Va. 3416 B St., S.E., Washington 19, D.C. 
Belz, Asso. Prof. Maurice H. M.A. (Melbourne) Univ. of Melbourne, Carlton, N. 3, 
Victoria, Australia 
Bennett, Prof. Albert A. Ph.D. (Princeton) Brown Univ., Providence, R. I. 
Bennett, Blair M. M.A. (Columbia) Asso. Math. Nat. Bur. of Standards, Washington,
D.C. 1410 M St., N.W. 

Bennett, Carl A. M.A. (Michigan) Chem., Clinton Engineer Works, Tenn. Eastman
Corp., Knoxville 5, Tenn. 237 West Tenn., Oak Ridge 

Berger, Richard M.A. (Columbia) Asso. Stat., Office of Price Admin., Washington, 
D.C. Lt.(j.g.) USNR Communication Officer, USS Gainard (DD706) c/o FPO San
Francisco, Calif. 25 Rugby Road, Rockville Centre, N. Y. 

Berkson, Joseph D.Sc. (Johns Hopkins) Col. M.C., US Army, AAF, Office of the Air 
Surgeon, Washington 25, D. C. 

Berman, Abraham J. M.A. (Brooklyn) Stat., N. Y. State Dept. of Labor, 80 Center St., 
New York City, 1460 College Ave., Bronx, N. Y.

Berwick, Leo A.B. (New York) Capt., A.C. Asst. to Surgeon Stat. Unit of Psych. 
Sect., Hdq. AFTRC, T & P Bldg., Fort Worth 2, Texas 

Bickerstaff, Asst. Prof. Thomas A. M.A. (Mississippi) Univ. of Mississippi, State 
College, Miss. 

Bigelow, Julian H. 401 W. 118th St., New York City 27 

Birnbaum, Asst. Prof. Z. William Ph.D. (Lwow) Univ. of Washington, Seattle, Wash. 

Blackadar, Walter L. B.A. (McMaster) Asso. Actuary, Equitable Life Assurance So- 
ciety of the US, 393 7th Ave., New York City 1 

Blackburn, Asso. Prof. Raymond F. Ph.D. (Pittsburgh) Head, Dept. of Stat., Univ. of 
Pittsburgh, Pittsburgh 13, Pa. 

Blackwell, Asst. Prof. David Ph.D. (Illinois) Math. Dept. Howard Univ., Washington, 
D. C. 

Blake, Archie Ph.D. (Chicago) Ballistic Res. Lab., Aberdeen Proving Gd. Box 86,
Aberdeen, Md. 

Blanche, Ernest E. Ph.D. (Illinois) Foreign Econ. Admin., 515-22nd St., N.W., Wash-
ington, D. C., 9409 Montgomery Ave., N. Chevy Chase, Md., APO 24741 c/o Postmas- 
ter, New York City 

*Bliss, Asso. Prof. Chester I. Ph.D. (Columbia) Biometrician, Conn. Agri. Exp. Sta., 
Lecturer in Biometry, Yale Univ., New Haven, Conn. 

Bloom, Rose B.A. (Hunter) 1275 SCSU, Fort Jay, N. Y. 

Bloom, Royal F. M.A. (Minnesota) Lt. Comdr., 4717 Arlington Annex Navy Dept., 
Washington, D. C. 

Blommers, Paul J. Ph.D. (Iowa) Univ. Examiner and Registrar, 114 Univ. Hall, State 
Univ. of Iowa, Iowa City, Iowa 

Boddie, John B., Jr. Chief: Budget Formulation, Foreign Econ. Admin., Washington, 
D. C. 2628 Tunlaw Rd., N.W. 

Bonis, Austin J. B.S. (C.C.N.Y.) Major, G-I War Dept. Gen. Staff, Washington, D. C. 
2500 Que St., N.W. 

Bonnar, Robert U. M.S. (Washington) 2/9 Jefferson St., Vallejo, Calif. 

Boozer, Mary E. A.M. (Chicago) Stat. Res., Virginia State Planning Bd., 301 Finance
Bldg., Richmond 19, Va. 

Borland, James M.A. (Indiana) Capt., Ex. Officer, Inspection Office, Pine Bluff Ar- 
senal, Ark. 

Bowen, Earl K. M.A. (Boston) Instr. in Math., Northeastern Univ., 360 Huntington 
Ave., Boston, Mass. 246 Union St., Norwood 

Boschan, Paul Ph.D. (Vienna) Econ. Inst., 500 Fifth Ave., New York City. 104 W. 
40th St., New York City 18

Bower, Oliver K. Ph.D. (Illinois) Associate, Univ. of Ill., Urbana, Ill. 505 W. John 
Champaign 

†Bowker, Albert H. S.B. (Mass. Inst. Tech.) Student, Columbia Univ., New York City
27, 22 Arden Place, Yonkers 3, N. Y. 
Brady, Dorothy S. Ph.D. (California) Home Ec. Specialist, Bur. of Home Econ., Wash-
ington, D.C. 4573 Fulton St., N.W. 

Brandt, Alva E. Ph.D. (Iowa State) Chief, Erosion Control Practices Div., Soil Conser- 
vation Service, USDA, US Dept. of Agri., Washington, D. C., Box 135, Route 8, 
Vienna, Va. 

Brearty, Charles R. B.S. (California) Major, US Army, Signal Corps Inspection Agency,
12th Floor, Public Ledger Bldg., 6th and Chestnut Sts., Philadelphia 6, Pa. 

Breden, Robert E. B.S. (Kansas) Personnel Tech., Personnel Res. Dept., The Procter
& Gamble Co., 6th and Main Sts., Cincinnati, Ohio 

Bridger, Clyde A. M.S. (Oregon) Inst. Math., Univ. of Utah, Salt Lake City 1, Utah. 
836 Douglas Street, Salt Lake City 2 

Brier, Glenn A.M. (George Washington) Meteorologist, US Weather Bur., Washington 
2, D. C. 

Brixey, Asso. Prof. John C. Ph.D. (Chicago) Univ. of Oklahoma, Norman, Okla. 927 
S. Pickard St., Norman 